174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel

Message ID CAJZ5v0gRYcC5fJZF207iyehvPR9_2zqprSvWHA_Qt93W9njqAw@mail.gmail.com (mailing list archive)
State Not Applicable, archived

Commit Message

Rafael J. Wysocki Jan. 10, 2017, 1:27 a.m. UTC
On Tue, Jan 10, 2017 at 12:52 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
>> Lemme run it.
>
> Well, it boots but I get:
>
> [    0.291447] ------------[ cut here ]------------
> [    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
> [    0.292107] Modules linked in:
> [    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
> [    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> [    0.292893] Call Trace:
> [    0.293072]  ? dump_stack+0x46/0x63
> [    0.293285]  ? __warn+0xec/0x110
> [    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
> [    0.293735]  ? kernel_init_freeable+0x58/0x19a
> [    0.293976]  ? rest_init+0x80/0x80
> [    0.294153]  ? kernel_init+0xa/0x100
> [    0.294334]  ? ret_from_fork+0x22/0x30
> [    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---
>
> TBH, I like Rafael's suggestion in the other mail to stick with fixing
> this in ACPI, especially since this is an ACPI problem, not RCU. Well,
> more or less: RCU could be taught to *not* do schedule_work() if
> workqueue_init() hasn't happened yet, but that's tangential.
>
> So, I'm going to bed. When I wake up, I want to see working fixes!
>
> :-)))

Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
worked, the attached one should work too (please test), but it can be
justified in a slightly more convincing way.

Namely, the idea is that acpi_os_read/write_memory() should never be
used before invoking acpi_os_initialize() and since those are the only
places where the list of memory regions is walked under RCU without
extra locking, it is safe to skip the RCU synchronization until that
happens.

Thanks,
Rafael
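
The reasoning above can be modeled outside the kernel. What follows is a
minimal user-space sketch in C, not the actual drivers/acpi/osl.c code:
os_initialized, wait_for_readers(), map_create(), map_cleanup() and
simulate_boot() are hypothetical stand-ins for acpi_os_initialized,
synchronize_rcu_expedited() and acpi_os_map_cleanup(). It only illustrates
that a mapping torn down before initialization skips the grace-period wait,
while one torn down afterwards does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel APIs. */
static bool os_initialized;     /* models acpi_os_initialized */
static int grace_period_waits;  /* counts simulated synchronize_rcu_expedited() calls */

struct mapping {
	unsigned long refcount;
	void *virt;
};

/* Models waiting until every lockless reader has left its critical section. */
static void wait_for_readers(void)
{
	grace_period_waits++;
}

/* Allocates a mapping with refcount 0, i.e. ready to be cleaned up. */
static struct mapping *map_create(void)
{
	struct mapping *map = calloc(1, sizeof(*map));

	assert(map != NULL);
	map->virt = malloc(16);
	assert(map->virt != NULL);
	return map;
}

/* Models acpi_os_map_cleanup() with the proposed change applied. */
static void map_cleanup(struct mapping *map)
{
	if (map->refcount)
		return;

	/*
	 * Assumption from the mail above: lockless readers cannot exist
	 * before initialization, so there is nobody to wait for.
	 */
	if (os_initialized)
		wait_for_readers();

	free(map->virt);
	free(map);
}

/* Tears down one mapping before and one after initialization. */
static int simulate_boot(void)
{
	os_initialized = false;
	grace_period_waits = 0;

	map_cleanup(map_create());   /* early boot: wait is skipped */

	os_initialized = true;       /* models acpi_os_initialize() completing */

	map_cleanup(map_create());   /* later: wait happens */

	return grace_period_waits;
}
```

Under these assumptions simulate_boot() returns 1: only the teardown that
happens after initialization waits for readers.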

Comments

Paul E. McKenney Jan. 10, 2017, 2:23 a.m. UTC | #1
On Tue, Jan 10, 2017 at 02:27:16AM +0100, Rafael J. Wysocki wrote:
> On Tue, Jan 10, 2017 at 12:52 AM, Borislav Petkov <bp@alien8.de> wrote:
> > On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
> >> Lemme run it.
> >
> > Well, it boots but I get:
> >
> > [    0.291447] ------------[ cut here ]------------
> > [    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
> > [    0.292107] Modules linked in:
> > [    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
> > [    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> > [    0.292893] Call Trace:
> > [    0.293072]  ? dump_stack+0x46/0x63
> > [    0.293285]  ? __warn+0xec/0x110
> > [    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
> > [    0.293735]  ? kernel_init_freeable+0x58/0x19a
> > [    0.293976]  ? rest_init+0x80/0x80
> > [    0.294153]  ? kernel_init+0xa/0x100
> > [    0.294334]  ? ret_from_fork+0x22/0x30
> > [    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---
> >
> > TBH, I like Rafael's suggestion in the other mail to stick with fixing
> > this in ACPI, especially since this is an ACPI problem, not RCU. Well,
> > more or less: RCU could be taught to *not* do schedule_work() if
> > workqueue_init() hasn't happened yet, but that's tangential.
> >
> > So, I'm going to bed. When I wake up, I want to see working fixes!
> >
> > :-)))
> 
> Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
> worked, the attached one should work too (please test), but it can be
> justified in a slightly more convincing way.
> 
> Namely, the idea is that acpi_os_read/write_memory() should never be
> used before invoking acpi_os_initialize() and since those are the only
> places where the list of memory regions is walked under RCU without
> extra locking, it is safe to skip the RCU synchronization until that
> happens.
> 
> Thanks,
> Rafael

Makes sense to me!

It looks like I can make the grace-period-free boot-time window
for CONFIG_PREEMPT=y kernels quite a bit narrower, but it does not
look like something suitable for jamming into 4.10.

							Thanx, Paul

> ---
>  drivers/acpi/osl.c |    8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/acpi/osl.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/osl.c
> +++ linux-pm/drivers/acpi/osl.c
> @@ -378,7 +378,9 @@ static void acpi_os_drop_map_ref(struct
>  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
>  {
>  	if (!map->refcount) {
> -		synchronize_rcu_expedited();
> +		if (acpi_os_initialized)
> +			synchronize_rcu_expedited();
> +
>  		acpi_unmap(map->phys, map->virt);
>  		kfree(map);
>  	}
> @@ -671,6 +673,8 @@ acpi_os_read_memory(acpi_physical_addres
>  	bool unmap = false;
>  	u64 dummy;
>  
> +	WARN_ON_ONCE(!acpi_os_initialized);
> +
>  	rcu_read_lock();
>  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
>  	if (!virt_addr) {
> @@ -716,6 +720,8 @@ acpi_os_write_memory(acpi_physical_addre
>  	unsigned int size = width / 8;
>  	bool unmap = false;
>  
> +	WARN_ON_ONCE(!acpi_os_initialized);
> +
>  	rcu_read_lock();
>  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
>  	if (!virt_addr) {

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lv Zheng Jan. 10, 2017, 5:41 a.m. UTC | #2
Hi, Rafael and Paul

> From: Paul E. McKenney [mailto:paulmck@linux.vnet.ibm.com]
> Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory() from Linux kernel
> 
> On Tue, Jan 10, 2017 at 02:27:16AM +0100, Rafael J. Wysocki wrote:
> > On Tue, Jan 10, 2017 at 12:52 AM, Borislav Petkov <bp@alien8.de> wrote:
> > > On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
> > >> Lemme run it.
> > >
> > > Well, it boots but I get:
> > >
> > > [    0.291447] ------------[ cut here ]------------
> > > [    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
> > > [    0.292107] Modules linked in:
> > > [    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
> > > [    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> > > [    0.292893] Call Trace:
> > > [    0.293072]  ? dump_stack+0x46/0x63
> > > [    0.293285]  ? __warn+0xec/0x110
> > > [    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
> > > [    0.293735]  ? kernel_init_freeable+0x58/0x19a
> > > [    0.293976]  ? rest_init+0x80/0x80
> > > [    0.294153]  ? kernel_init+0xa/0x100
> > > [    0.294334]  ? ret_from_fork+0x22/0x30
> > > [    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---
> > >
> > > TBH, I like Rafael's suggestion in the other mail to stick with fixing
> > > this in ACPI, especially since this is an ACPI problem, not RCU. Well,
> > > more or less: RCU could be taught to *not* do schedule_work() if
> > > workqueue_init() hasn't happened yet, but that's tangential.
> > >
> > > So, I'm going to bed. When I wake up, I want to see working fixes!
> > >
> > > :-)))
> >
> > Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
> > worked, the attached one should work too (please test), but it can be
> > justified in a slightly more convincing way.
> >
> > Namely, the idea is that acpi_os_read/write_memory() should never be
> > used before invoking acpi_os_initialize() and since those are the only
> > places where the list of memory regions is walked under RCU without
> > extra locking, it is safe to skip the RCU synchronization until that
> > happens.
> >
> > Thanks,
> > Rafael
> 
> Makes sense to me!

Also looks good to me.

> 
> It looks like I can make the grace-period-free boot-time window
> for CONFIG_PREEMPT=y kernels quite a bit narrower, but it does not
> look like something suitable for jamming into 4.10.

OK, we can have this fixed in the ACPI layer first.

Thanks
Lv

> 
> 							Thanx, Paul
> 
> > ---
> >  drivers/acpi/osl.c |    8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > Index: linux-pm/drivers/acpi/osl.c
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/osl.c
> > +++ linux-pm/drivers/acpi/osl.c
> > @@ -378,7 +378,9 @@ static void acpi_os_drop_map_ref(struct
> >  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
> >  {
> >  	if (!map->refcount) {
> > -		synchronize_rcu_expedited();
> > +		if (acpi_os_initialized)
> > +			synchronize_rcu_expedited();
> > +
> >  		acpi_unmap(map->phys, map->virt);
> >  		kfree(map);
> >  	}
> > @@ -671,6 +673,8 @@ acpi_os_read_memory(acpi_physical_addres
> >  	bool unmap = false;
> >  	u64 dummy;
> >
> > +	WARN_ON_ONCE(!acpi_os_initialized);
> > +
> >  	rcu_read_lock();
> >  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
> >  	if (!virt_addr) {
> > @@ -716,6 +720,8 @@ acpi_os_write_memory(acpi_physical_addre
> >  	unsigned int size = width / 8;
> >  	bool unmap = false;
> >
> > +	WARN_ON_ONCE(!acpi_os_initialized);
> > +
> >  	rcu_read_lock();
> >  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
> >  	if (!virt_addr) {

Paul E. McKenney Jan. 10, 2017, 5:51 a.m. UTC | #3
On Tue, Jan 10, 2017 at 05:41:45AM +0000, Zheng, Lv wrote:
> Hi, Rafael and Paul
> 
> > From: Paul E. McKenney [mailto:paulmck@linux.vnet.ibm.com]
> > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > early_acpi_os_unmap_memory() from Linux kernel
> > 
> > On Tue, Jan 10, 2017 at 02:27:16AM +0100, Rafael J. Wysocki wrote:
> > > On Tue, Jan 10, 2017 at 12:52 AM, Borislav Petkov <bp@alien8.de> wrote:
> > > > On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
> > > >> Lemme run it.
> > > >
> > > > Well, it boots but I get:
> > > >
> > > > [    0.291447] ------------[ cut here ]------------
> > > > [    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
> > > > [    0.292107] Modules linked in:
> > > > [    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
> > > > [    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> > > > [    0.292893] Call Trace:
> > > > [    0.293072]  ? dump_stack+0x46/0x63
> > > > [    0.293285]  ? __warn+0xec/0x110
> > > > [    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
> > > > [    0.293735]  ? kernel_init_freeable+0x58/0x19a
> > > > [    0.293976]  ? rest_init+0x80/0x80
> > > > [    0.294153]  ? kernel_init+0xa/0x100
> > > > [    0.294334]  ? ret_from_fork+0x22/0x30
> > > > [    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---
> > > >
> > > > TBH, I like Rafael's suggestion in the other mail to stick with fixing
> > > > this in ACPI, especially since this is an ACPI problem, not RCU. Well,
> > > > more or less: RCU could be taught to *not* do schedule_work() if
> > > > workqueue_init() hasn't happened yet, but that's tangential.
> > > >
> > > > So, I'm going to bed. When I wake up, I want to see working fixes!
> > > >
> > > > :-)))
> > >
> > > Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
> > > worked, the attached one should work too (please test), but it can be
> > > justified in a slightly more convincing way.
> > >
> > > Namely, the idea is that acpi_os_read/write_memory() should never be
> > > used before invoking acpi_os_initialize() and since those are the only
> > > places where the list of memory regions is walked under RCU without
> > > extra locking, it is safe to skip the RCU synchronization until that
> > > happens.
> > >
> > > Thanks,
> > > Rafael
> > 
> > Makes sense to me!
> 
> Also looks good to me.
> 
> > 
> > It looks like I can make the grace-period-free boot-time window
> > for CONFIG_PREEMPT=y kernels quite a bit narrower, but it does not
> > look like something suitable for jamming into 4.10.
> 
> OK, we can have this fixed in the ACPI layer first.

Definitely.

I have the RCU changes written in ink on paper and they definitely are
-not- something that goes into 4.10.  4.11 at the earliest, and if no
one asks for it in 4.11, it goes into 4.12.  Serious testing needed.  ;-)

							Thanx, Paul

> Thanks
> Lv
> 
> > 
> > 							Thanx, Paul
> > 
> > > ---
> > >  drivers/acpi/osl.c |    8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > Index: linux-pm/drivers/acpi/osl.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/acpi/osl.c
> > > +++ linux-pm/drivers/acpi/osl.c
> > > @@ -378,7 +378,9 @@ static void acpi_os_drop_map_ref(struct
> > >  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
> > >  {
> > >  	if (!map->refcount) {
> > > -		synchronize_rcu_expedited();
> > > +		if (acpi_os_initialized)
> > > +			synchronize_rcu_expedited();
> > > +
> > >  		acpi_unmap(map->phys, map->virt);
> > >  		kfree(map);
> > >  	}
> > > @@ -671,6 +673,8 @@ acpi_os_read_memory(acpi_physical_addres
> > >  	bool unmap = false;
> > >  	u64 dummy;
> > >
> > > +	WARN_ON_ONCE(!acpi_os_initialized);
> > > +
> > >  	rcu_read_lock();
> > >  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
> > >  	if (!virt_addr) {
> > > @@ -716,6 +720,8 @@ acpi_os_write_memory(acpi_physical_addre
> > >  	unsigned int size = width / 8;
> > >  	bool unmap = false;
> > >
> > > +	WARN_ON_ONCE(!acpi_os_initialized);
> > > +
> > >  	rcu_read_lock();
> > >  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
> > >  	if (!virt_addr) {
> 

Borislav Petkov Jan. 11, 2017, 9:21 a.m. UTC | #4
On Mon, Jan 09, 2017 at 09:51:29PM -0800, Paul E. McKenney wrote:
> Definitely.

Btw, we have more breakage from RCU expedited using workqueues:
https://bugzilla.kernel.org/show_bug.cgi?id=192111

I've added you to CC but let me have other bug reporters confirm reverting

  8b355e3bc140 ("rcu: Drive expedited grace periods from workqueue")

does fix the issue for them too.

Thanks.
Paul E. McKenney Jan. 11, 2017, 9:51 a.m. UTC | #5
On Wed, Jan 11, 2017 at 10:21:06AM +0100, Borislav Petkov wrote:
> On Mon, Jan 09, 2017 at 09:51:29PM -0800, Paul E. McKenney wrote:
> > Definitely.
> 
> Btw, we have more breakage from RCU expedited using workqueues:
> https://bugzilla.kernel.org/show_bug.cgi?id=192111
> 
> I've added you to CC but let me have other bug reporters confirm reverting
> 
>   8b355e3bc140 ("rcu: Drive expedited grace periods from workqueue")
> 
> does fix the issue for them too.

Yes, you could make RCU expedited grace periods go back to using the
requesting task, and that would allow expedited grace periods to run early
in the boot process.  But that causes problems with signals and the like
unless you revert a few other patches.  The bugzilla is interesting --
it looks like ACPI was in some cases doing early-boot grace-period waits
some time back?

I have a limping prototype RCU patch that should avoid this problem.

If all goes well, I will send it out late tomorrow evening, Pacific Time.

							Thanx, Paul

Borislav Petkov Jan. 11, 2017, 10:03 a.m. UTC | #6
On Wed, Jan 11, 2017 at 01:51:56AM -0800, Paul E. McKenney wrote:
> Yes, you could make RCU expedited grace periods go back to using the
> requesting task, and that would allow expedited grace periods to run early
> in the boot process.  But that causes problems with signals and the like
> unless you revert a few other patches.  The bugzilla is interesting --
> it looks like ACPI was in some cases doing early-boot grace-period waits
> some time back?

I think this and https://bugzilla.suse.com/show_bug.cgi?id=1017783 are
examples of a bunch of Toshiba schlaptops which cause the issue. So it
looks like ACPI is doing something very early on those which triggers
the issue.

But this is ACPI - anything can happen!

> I have a limping prototype RCU patch that should avoid this problem.
>
> If all goes well, I will send it out late tomorrow evening, Pacific Time.

Attach it to the bugzilla too, pls, because the people there trigger the
issue.

I have the respective(?) SUSE bug and I can ask people there to run it
too.

Thanks.
Paul E. McKenney Jan. 11, 2017, 10:22 a.m. UTC | #7
On Wed, Jan 11, 2017 at 11:03:23AM +0100, Borislav Petkov wrote:
> On Wed, Jan 11, 2017 at 01:51:56AM -0800, Paul E. McKenney wrote:
> > Yes, you could make RCU expedited grace periods go back to using the
> > requesting task, and that would allow expedited grace periods to run early
> > in the boot process.  But that causes problems with signals and the like
> > unless you revert a few other patches.  The bugzilla is interesting --
> > it looks like ACPI was in some cases doing early-boot grace-period waits
> > some time back?
> 
> I think this and https://bugzilla.suse.com/show_bug.cgi?id=1017783 are
> examples of a bunch of Toshiba schlaptops which cause the issue. So it
> looks like ACPI is doing something very early on those which triggers
> the issue.
> 
> But this is ACPI - anything can happen!

;-) ;-) ;-)

> > I have a limping prototype RCU patch that should avoid this problem.
> >
> > If all goes well, I will send it out late tomorrow evening, Pacific Time.
> 
> Attach it to the bugzilla too, pls, because the people there trigger the
> issue.
> 
> I have the respective(?) SUSE bug and I can ask people there to run it
> too.

That would be very good!  Thinking good thoughts for the ongoing tests...

							Thanx, Paul

Patch

---
 drivers/acpi/osl.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/acpi/osl.c
===================================================================
--- linux-pm.orig/drivers/acpi/osl.c
+++ linux-pm/drivers/acpi/osl.c
@@ -378,7 +378,9 @@  static void acpi_os_drop_map_ref(struct
 static void acpi_os_map_cleanup(struct acpi_ioremap *map)
 {
 	if (!map->refcount) {
-		synchronize_rcu_expedited();
+		if (acpi_os_initialized)
+			synchronize_rcu_expedited();
+
 		acpi_unmap(map->phys, map->virt);
 		kfree(map);
 	}
@@ -671,6 +673,8 @@  acpi_os_read_memory(acpi_physical_addres
 	bool unmap = false;
 	u64 dummy;
 
+	WARN_ON_ONCE(!acpi_os_initialized);
+
 	rcu_read_lock();
 	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
 	if (!virt_addr) {
@@ -716,6 +720,8 @@  acpi_os_write_memory(acpi_physical_addre
 	unsigned int size = width / 8;
 	bool unmap = false;
 
+	WARN_ON_ONCE(!acpi_os_initialized);
+
 	rcu_read_lock();
 	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
 	if (!virt_addr) {