Message ID | 20090729215425.23674.80263.stgit@bob.kio (mailing list archive) |
---|---|
State | Accepted |
On Thu, 2009-07-30 at 05:54 +0800, Bjorn Helgaas wrote:
> On some machines, a software-initiated SMI causes corruption unless the
> SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> done in GPE-related methods that are run via workqueues, so we can avoid
> the known corruption cases by binding the workqueues to CPU 0.
>
> References:
> http://bugzilla.kernel.org/show_bug.cgi?id=13751
> https://bugs.launchpad.net/bugs/157171
> https://bugs.launchpad.net/bugs/157691
>
> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>

Acked-by: Zhang Rui <rui.zhang@intel.com>

> ---
>  drivers/acpi/osl.c |   25 +++++++++++++++++++++++++
>  1 files changed, 25 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 7167071..5691f16 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -189,11 +189,36 @@ acpi_status __init acpi_os_initialize(void)
>  	return AE_OK;
>  }
>
> +static void bind_to_cpu0(struct work_struct *work)
> +{
> +	set_cpus_allowed(current, cpumask_of_cpu(0));
> +	kfree(work);
> +}
> +
> +static void bind_workqueue(struct workqueue_struct *wq)
> +{
> +	struct work_struct *work;
> +
> +	work = kzalloc(sizeof(struct work_struct), GFP_KERNEL);
> +	INIT_WORK(work, bind_to_cpu0);
> +	queue_work(wq, work);
> +}
> +
>  acpi_status acpi_os_initialize1(void)
>  {
> +	/*
> +	 * On some machines, a software-initiated SMI causes corruption unless
> +	 * the SMI runs on CPU 0.  An SMI can be initiated by any AML, but
> +	 * typically it's done in GPE-related methods that are run via
> +	 * workqueues, so we can avoid the known corruption cases by binding
> +	 * the workqueues to CPU 0.
> +	 */
>  	kacpid_wq = create_singlethread_workqueue("kacpid");
> +	bind_workqueue(kacpid_wq);
>  	kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
> +	bind_workqueue(kacpi_notify_wq);
>  	kacpi_hotplug_wq = create_singlethread_workqueue("kacpi_hotplug");
> +	bind_workqueue(kacpi_hotplug_wq);
>  	BUG_ON(!kacpid_wq);
>  	BUG_ON(!kacpi_notify_wq);
>  	BUG_ON(!kacpi_hotplug_wq);

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jul 30, 2009 at 05:54:25AM +0800, Bjorn Helgaas wrote:
> On some machines, a software-initiated SMI causes corruption unless the
> SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> done in GPE-related methods that are run via workqueues, so we can avoid
> the known corruption cases by binding the workqueues to CPU 0.
>
> References:
> http://bugzilla.kernel.org/show_bug.cgi?id=13751
> https://bugs.launchpad.net/bugs/157171
> https://bugs.launchpad.net/bugs/157691

Good job! Since any AML code can invoke an SMI, I wonder if all ACPICA
should be limited to run on CPU 0?
On Thu, Jul 30, 2009 at 10:43:00AM +0800, Shaohua Li wrote:
> On Thu, Jul 30, 2009 at 05:54:25AM +0800, Bjorn Helgaas wrote:
> > On some machines, a software-initiated SMI causes corruption unless the
> > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > done in GPE-related methods that are run via workqueues, so we can avoid
> > the known corruption cases by binding the workqueues to CPU 0.
> >
> > References:
> > http://bugzilla.kernel.org/show_bug.cgi?id=13751
> > https://bugs.launchpad.net/bugs/157171
> > https://bugs.launchpad.net/bugs/157691
> Good job! Since any AML code can invoke an SMI, I wonder if all ACPICA
> should be limited to run on CPU 0?

If ACPI is a performance bottleneck then we have other problems, so I
suspect that we could live with that. We'd probably want to be able to
disable it at runtime for the small number of users who have
"interesting" performance requirements, but falling on the side of
safety over slightly reduced latency under some circumstances seems fair
to me. It'd be interesting to see if this helps with any of the other
SMI-related hangs we've seen.
On Thu, Jul 30, 2009 at 10:55:54AM +0800, Matthew Garrett wrote:
> On Thu, Jul 30, 2009 at 10:43:00AM +0800, Shaohua Li wrote:
> > On Thu, Jul 30, 2009 at 05:54:25AM +0800, Bjorn Helgaas wrote:
> > > On some machines, a software-initiated SMI causes corruption unless the
> > > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > > done in GPE-related methods that are run via workqueues, so we can avoid
> > > the known corruption cases by binding the workqueues to CPU 0.
> > >
> > > References:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=13751
> > > https://bugs.launchpad.net/bugs/157171
> > > https://bugs.launchpad.net/bugs/157691
> > Good job! Since any AML code can invoke an SMI, I wonder if all ACPICA
> > should be limited to run on CPU 0?
>
> If ACPI is a performance bottleneck then we have other problems, so I
> suspect that we could live with that. We'd probably want to be able to
> disable it at runtime for the small number of users who have
> "interesting" performance requirements, but falling on the side of
> safety over slightly reduced latency under some circumstances seems fair
> to me. It'd be interesting to see if this helps with any of the other
> SMI-related hangs we've seen.

ACPICA isn't designed for performance. If it had a performance issue, we
would already have seen it.
On Thu, Jul 30, 2009 at 11:13:48AM +0800, Shaohua Li wrote:
> ACPICA isn't designed for performance. If it had a performance issue, we
> would already have seen it.

Yeah. My point was just that we have some customers who like tuning
systems heavily - I suspect they'd prefer to be able to control whether
or not ACPI runs entirely on CPU 0. As you say, it should make little
difference in the real world, but some people do have very specialised
requirements.
On Wednesday 29 July 2009 08:43:00 pm Shaohua Li wrote:
> On Thu, Jul 30, 2009 at 05:54:25AM +0800, Bjorn Helgaas wrote:
> > On some machines, a software-initiated SMI causes corruption unless the
> > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > done in GPE-related methods that are run via workqueues, so we can avoid
> > the known corruption cases by binding the workqueues to CPU 0.
> >
> > References:
> > http://bugzilla.kernel.org/show_bug.cgi?id=13751
> > https://bugs.launchpad.net/bugs/157171
> > https://bugs.launchpad.net/bugs/157691
> Good job! Since any AML code can invoke an SMI, I wonder if all ACPICA
> should be limited to run on CPU 0?

I did look into doing that, but I didn't see an easy way to do it.

My first thought was that we could do a set_cpus_allowed() in
acpi_ex_enter_interpreter() and restore in acpi_ex_exit_interpreter().
But of course, those are ACPI CA functions, so to do it without an
ACPI CA change would mean some kind of hook in acpi_os_wait_semaphore(),
and there, we don't know *which* semaphore means "enter interpreter".

So I gave up for now. But if somebody has a smarter idea, I agree
that it would be nice to at least have the option to run all AML on
CPU 0.

Bjorn
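The save-and-restore idea described in the message above (pin on entry to the interpreter, restore on exit) can be illustrated with a minimal userspace sketch. This is not kernel or ACPICA code: `enter_on_cpu0()`, `exit_restore()` and `struct saved_affinity` are hypothetical names standing in for the enter/exit hooks, and the Linux `sched_getaffinity()`/`sched_setaffinity()` calls stand in for the kernel's `set_cpus_allowed()`.

```c
#define _GNU_SOURCE		/* for cpu_set_t, CPU_* macros, sched_getcpu() */
#include <assert.h>
#include <sched.h>

/* Hypothetical holder for the affinity mask that was in effect on entry. */
struct saved_affinity {
	cpu_set_t mask;
	int valid;		/* non-zero only if we actually changed the mask */
};

/*
 * Stand-in for an "enter interpreter" hook: remember the caller's current
 * affinity, then restrict the calling thread to CPU 0.
 * Returns 0 on success, -1 if either step fails (for example when CPU 0 is
 * not available to this process).
 */
static int enter_on_cpu0(struct saved_affinity *saved)
{
	cpu_set_t cpu0;

	assert(saved);
	saved->valid = 0;
	if (sched_getaffinity(0, sizeof(saved->mask), &saved->mask) != 0)
		return -1;
	CPU_ZERO(&cpu0);
	CPU_SET(0, &cpu0);
	if (sched_setaffinity(0, sizeof(cpu0), &cpu0) != 0)
		return -1;
	saved->valid = 1;
	return 0;
}

/* Stand-in for an "exit interpreter" hook: put the saved mask back. */
static int exit_restore(const struct saved_affinity *saved)
{
	assert(saved);
	if (!saved->valid)
		return 0;	/* nothing was changed, nothing to undo */
	return sched_setaffinity(0, sizeof(saved->mask), &saved->mask);
}
```

A caller would wrap the sensitive region as `if (enter_on_cpu0(&s) == 0) { /* work */ exit_restore(&s); }`. The sketch also shows why the hook location matters: the pair only helps if every path into the region goes through it, which is the difficulty the message describes with `acpi_os_wait_semaphore()`.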
On Wednesday 29 July 2009 06:59:59 pm Zhang Rui wrote:
> On Thu, 2009-07-30 at 05:54 +0800, Bjorn Helgaas wrote:
> > On some machines, a software-initiated SMI causes corruption unless the
> > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > done in GPE-related methods that are run via workqueues, so we can avoid
> > the known corruption cases by binding the workqueues to CPU 0.
> >
> > References:
> > http://bugzilla.kernel.org/show_bug.cgi?id=13751
> > https://bugs.launchpad.net/bugs/157171
> > https://bugs.launchpad.net/bugs/157691
> >
> > Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
>
> Acked-by: Zhang Rui <rui.zhang@intel.com>

In addition to the reports above, I think it's likely this patch
will fix the problems reported below:

http://bugzilla.kernel.org/show_bug.cgi?id=13412
http://bugzilla.kernel.org/show_bug.cgi?id=11259
http://bugzilla.kernel.org/show_bug.cgi?id=12328
http://bugzilla.kernel.org/show_bug.cgi?id=12106

I think we should consider this patch for 2.6.31.

(Rafael, 13751 is on your "2.6.29 -> 2.6.30" regression list.
I actually think it's been around much longer than that, but
there seem to be many things that affect whether it manifests.)

Bjorn

> > ---
> >  drivers/acpi/osl.c |   25 +++++++++++++++++++++++++
> >  1 files changed, 25 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> > index 7167071..5691f16 100644
> > --- a/drivers/acpi/osl.c
> > +++ b/drivers/acpi/osl.c
> > @@ -189,11 +189,36 @@ acpi_status __init acpi_os_initialize(void)
> >  	return AE_OK;
> >  }
> >
> > +static void bind_to_cpu0(struct work_struct *work)
> > +{
> > +	set_cpus_allowed(current, cpumask_of_cpu(0));
> > +	kfree(work);
> > +}
> > +
> > +static void bind_workqueue(struct workqueue_struct *wq)
> > +{
> > +	struct work_struct *work;
> > +
> > +	work = kzalloc(sizeof(struct work_struct), GFP_KERNEL);
> > +	INIT_WORK(work, bind_to_cpu0);
> > +	queue_work(wq, work);
> > +}
> > +
> >  acpi_status acpi_os_initialize1(void)
> >  {
> > +	/*
> > +	 * On some machines, a software-initiated SMI causes corruption unless
> > +	 * the SMI runs on CPU 0.  An SMI can be initiated by any AML, but
> > +	 * typically it's done in GPE-related methods that are run via
> > +	 * workqueues, so we can avoid the known corruption cases by binding
> > +	 * the workqueues to CPU 0.
> > +	 */
> >  	kacpid_wq = create_singlethread_workqueue("kacpid");
> > +	bind_workqueue(kacpid_wq);
> >  	kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
> > +	bind_workqueue(kacpi_notify_wq);
> >  	kacpi_hotplug_wq = create_singlethread_workqueue("kacpi_hotplug");
> > +	bind_workqueue(kacpi_hotplug_wq);
> >  	BUG_ON(!kacpid_wq);
> >  	BUG_ON(!kacpi_notify_wq);
> >  	BUG_ON(!kacpi_hotplug_wq);
On Saturday 01 August 2009, Bjorn Helgaas wrote:
> On Wednesday 29 July 2009 06:59:59 pm Zhang Rui wrote:
> > On Thu, 2009-07-30 at 05:54 +0800, Bjorn Helgaas wrote:
> > > On some machines, a software-initiated SMI causes corruption unless the
> > > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > > done in GPE-related methods that are run via workqueues, so we can avoid
> > > the known corruption cases by binding the workqueues to CPU 0.
> > >
> > > References:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=13751
> > > https://bugs.launchpad.net/bugs/157171
> > > https://bugs.launchpad.net/bugs/157691
> > >
> > > Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
> >
> > Acked-by: Zhang Rui <rui.zhang@intel.com>
>
> In addition to the reports above, I think it's likely this patch
> will fix the problems reported below:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=13412
> http://bugzilla.kernel.org/show_bug.cgi?id=11259
> http://bugzilla.kernel.org/show_bug.cgi?id=12328
> http://bugzilla.kernel.org/show_bug.cgi?id=12106
>
> I think we should consider this patch for 2.6.31.
>
> (Rafael, 13751 is on your "2.6.29 -> 2.6.30" regression list.
> I actually think it's been around much longer than that, but
> there seem to be many things that affect whether it manifests.)

I've dropped it from the list, thanks.

Best,
Rafael
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 7167071..5691f16 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -189,11 +189,36 @@ acpi_status __init acpi_os_initialize(void)
 	return AE_OK;
 }
 
+static void bind_to_cpu0(struct work_struct *work)
+{
+	set_cpus_allowed(current, cpumask_of_cpu(0));
+	kfree(work);
+}
+
+static void bind_workqueue(struct workqueue_struct *wq)
+{
+	struct work_struct *work;
+
+	work = kzalloc(sizeof(struct work_struct), GFP_KERNEL);
+	INIT_WORK(work, bind_to_cpu0);
+	queue_work(wq, work);
+}
+
 acpi_status acpi_os_initialize1(void)
 {
+	/*
+	 * On some machines, a software-initiated SMI causes corruption unless
+	 * the SMI runs on CPU 0.  An SMI can be initiated by any AML, but
+	 * typically it's done in GPE-related methods that are run via
+	 * workqueues, so we can avoid the known corruption cases by binding
+	 * the workqueues to CPU 0.
+	 */
 	kacpid_wq = create_singlethread_workqueue("kacpid");
+	bind_workqueue(kacpid_wq);
 	kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
+	bind_workqueue(kacpi_notify_wq);
 	kacpi_hotplug_wq = create_singlethread_workqueue("kacpi_hotplug");
+	bind_workqueue(kacpi_hotplug_wq);
 	BUG_ON(!kacpid_wq);
 	BUG_ON(!kacpi_notify_wq);
 	BUG_ON(!kacpi_hotplug_wq);
On some machines, a software-initiated SMI causes corruption unless the
SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
done in GPE-related methods that are run via workqueues, so we can avoid
the known corruption cases by binding the workqueues to CPU 0.

References:
http://bugzilla.kernel.org/show_bug.cgi?id=13751
https://bugs.launchpad.net/bugs/157171
https://bugs.launchpad.net/bugs/157691

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---
 drivers/acpi/osl.c |   25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)
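The patch pins each workqueue by queueing, as its first item, a job that sets the affinity of whichever thread runs it. Because the queue is single-threaded, every later item then runs on CPU 0. A rough userspace analogue of that pattern is sketched below, assuming Linux with glibc and pthreads. Everything here except `bind_to_cpu0` and `bind_workqueue` (named after the patch) is hypothetical scaffolding: `struct work`, `struct workqueue`, `wq_init()`, `wq_queue()`, `wq_drain_and_stop()`, `record_cpu()`, `bind_rc` and `last_cpu` are illustrative only, and `pthread_setaffinity_np()` stands in for the kernel's `set_cpus_allowed()`.

```c
#define _GNU_SOURCE		/* for pthread_setaffinity_np(), sched_getcpu() */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical single-threaded work queue; not the kernel implementation. */
struct work {
	void (*fn)(struct work *);
	struct work *next;
};
struct workqueue {
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	struct work *head;
	int stopping;
};
static int bind_rc = -1;	/* result of the affinity call in bind_to_cpu0() */
static int last_cpu = -1;	/* CPU observed by record_cpu() */

/* Worker loop: run items in FIFO order, exit once drained and stopping. */
static void *worker(void *arg)
{
	struct workqueue *wq = arg;
	struct work *w;

	pthread_mutex_lock(&wq->lock);
	for (;;) {
		while (!wq->head && !wq->stopping)
			pthread_cond_wait(&wq->cond, &wq->lock);
		w = wq->head;
		if (!w)
			break;
		wq->head = w->next;
		pthread_mutex_unlock(&wq->lock);
		w->fn(w);	/* executes in the worker thread */
		pthread_mutex_lock(&wq->lock);
	}
	pthread_mutex_unlock(&wq->lock);
	return NULL;
}

static int wq_init(struct workqueue *wq)
{
	wq->head = NULL;
	wq->stopping = 0;
	pthread_mutex_init(&wq->lock, NULL);
	pthread_cond_init(&wq->cond, NULL);
	return pthread_create(&wq->thread, NULL, worker, wq);
}

static void wq_queue(struct workqueue *wq, struct work *w)
{
	struct work **pp;

	assert(w && w->fn);
	w->next = NULL;
	pthread_mutex_lock(&wq->lock);
	for (pp = &wq->head; *pp; pp = &(*pp)->next)
		;		/* walk to the end of the list */
	*pp = w;
	pthread_cond_signal(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
}

static void wq_drain_and_stop(struct workqueue *wq)
{
	pthread_mutex_lock(&wq->lock);
	wq->stopping = 1;
	pthread_cond_signal(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
	pthread_join(wq->thread, NULL);
}

/* Work item that pins the thread running it to CPU 0, then frees itself. */
static void bind_to_cpu0(struct work *w)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(0, &set);
	bind_rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
	free(w);
}

/* Queue the pinning item first so that later items run on CPU 0. */
static int bind_workqueue(struct workqueue *wq)
{
	struct work *w = calloc(1, sizeof(*w));

	if (!w)
		return -1;	/* allocation failed; queue stays unpinned */
	w->fn = bind_to_cpu0;
	wq_queue(wq, w);
	return 0;
}

/* Example later work item: record which CPU the worker is running on. */
static void record_cpu(struct work *w)
{
	(void)w;
	last_cpu = sched_getcpu();
}
```

Typical use would be `wq_init(&wq)`, then `bind_workqueue(&wq)`, then queueing ordinary items such as one with `.fn = record_cpu`. Unlike the patch, this sketch checks the allocation and keeps the affinity call's return value, since pinning can fail when CPU 0 is outside the process's allowed set.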