diff mbox

ACPI: return first _ADR match for acpi_get_child

Message ID 8150264.b2f1OROKxE@al (mailing list archive)
State Rejected, archived
Headers show

Commit Message

Peter Wu Nov. 12, 2012, 10:28 p.m. UTC
From: Peter Wu <lekensteyn@gmail.com>

When acpi_get_child is called, it loops through all direct child
devices and calls do_acpi_find_child to find an ACPI handle matching the
_ADR. The previous implementation visits every direct child, if there
were multiple devices with a matching _ADR, the last match was always
returned.

This behaviour (returning the last result) does not work for many recent
Lenovo machines when looking for the correct ACPI handle for a Nvidia
PCI video device. On these machines, the first handle should be used
instead of the last one. On these machines, the PCI Bus ID is 01:00.0,
hence the address that is searched for is 0 (via acpi_pci_find_device).

When acpi_pci_find_device calls acpi_get_child, it iterates through:

    Address: 00000000 (valid); handle: \_SB_.PCI0.PEG0.PEGP
    Address: 00000001 (valid); handle: \_SB_.PCI0.PEG0.VGA1
    Address: 00000000 (valid); handle: \_SB_.PCI0.PEG0.VGA_

The correct handle here is "\_SB_.PCI0.PEG0.PEGP" which contains the
_ROM method as well as _PSx for PM, and not \_SB.PCI0.PEG0.VGA.

On my own laptop, and many others I believe, there is only one matching
handle for a PCI device. The acpi_get_child method has been added in
2005, 4e10d12a3d88c88fba3258809aa42d14fd8cf1d1, and there is no message
indicating that the last (or first) handle should be returned. However,
it is kind of useless to iterate from beginning to the end if you only
need the last match. Therefore, just return the first match (which is
likely the last one too in almost all cases).

Verified to work for an affected Lenovo Ideapad Y480, tested for no
regressions on my laptop (Clevo B7130).

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42696

Tested-by: Sergio Perez <dagobertstaler@gmail.com>
Signed-off-by: Peter Wu <lekensteyn@gmail.com>
---
 drivers/acpi/glue.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Len Brown Nov. 16, 2012, 4:25 p.m. UTC | #1
Peter,

It is great that you debugged this issue
and proved where the problem is.

However, this patch can't possibly be the right way to go --
as it is just as broken as the code it replaces.
Were I to bet, I'd say that it will break as many machines
as it fixes.  And when it does, where are we?

Clearly we need to be using a more clever search algorithm.

thanks,
Len Brown, Intel Open Source Technology Center


On 11/12/2012 05:28 PM, Lekensteyn wrote:
> From: Peter Wu <lekensteyn@gmail.com>
> 
> When acpi_get_child is called, it loops through all direct child
> devices and calls do_acpi_find_child to find an ACPI handle matching the
> _ADR. The previous implementation visits every direct child, if there
> were multiple devices with a matching _ADR, the last match was always
> returned.
> 
> This behaviour (returning the last result) does not work for many recent
> Lenovo machines when looking for the correct ACPI handle for a Nvidia
> PCI video device. On these machines, the first handle should be used
> instead of the last one. On these machines, the PCI Bus ID is 01:00.0,
> hence the address that is searched for is 0 (via acpi_pci_find_device).
> 
> When acpi_pci_find_device calls acpi_get_child, it iterates through:
> 
>     Address: 00000000 (valid); handle: \_SB_.PCI0.PEG0.PEGP
>     Address: 00000001 (valid); handle: \_SB_.PCI0.PEG0.VGA1
>     Address: 00000000 (valid); handle: \_SB_.PCI0.PEG0.VGA_
> 
> The correct handle here is "\_SB_.PCI0.PEG0.PEGP" which contains the
> _ROM method as well as _PSx for PM, and not \_SB.PCI0.PEG0.VGA.
> 
> On my own laptop, and many others I believe, there is only one matching
> handle for a PCI device. The acpi_get_child method has been added in
> 2005, 4e10d12a3d88c88fba3258809aa42d14fd8cf1d1, and there is no message
> indicating that the last (or first) handle should be returned. However,
> it is kind of useless to iterate from beginning to the end if you only
> need the last match. Therefore, just return the first match (which is
> likely the last one too in almost all cases).
> 
> Verified to work for an affected Lenovo Ideapad Y480, tested for no
> regressions on my laptop (Clevo B7130).
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42696
> 
> Tested-by: Sergio Perez <dagobertstaler@gmail.com>
> Signed-off-by: Peter Wu <lekensteyn@gmail.com>
> ---
>  drivers/acpi/glue.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/acpi/glue.c b/drivers/acpi/glue.c
> index 0837308..dc9c945 100644
> --- a/drivers/acpi/glue.c
> +++ b/drivers/acpi/glue.c
> @@ -99,18 +99,20 @@ struct acpi_find_child {
>  static acpi_status
>  do_acpi_find_child(acpi_handle handle, u32 lvl, void *context, void **rv)
>  {
> -	acpi_status status;
> +	acpi_status status, ret = AE_OK;
>  	struct acpi_device_info *info;
>  	struct acpi_find_child *find = context;
>  
>  	status = acpi_get_object_info(handle, &info);
>  	if (ACPI_SUCCESS(status)) {
>  		if ((info->address == find->address)
> -			&& (info->valid & ACPI_VALID_ADR))
> +			&& (info->valid & ACPI_VALID_ADR)) {
>  			find->handle = handle;
> +			ret = AE_CTRL_TERMINATE;
> +		}
>  		kfree(info);
>  	}
> -	return AE_OK;
> +	return ret;
>  }
>  
>  acpi_handle acpi_get_child(acpi_handle parent, u64 address)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Wu Nov. 16, 2012, 6:37 p.m. UTC | #2
Thank you for your reply.

On Friday 16 November 2012 11:25:47 Len Brown wrote:
> However, this patch can't possibly be the right way to go --
> as it is just as broken as the code it replaces.
This is a well-aimed shot in the dark based on the DSDT I see. It is certainly 
not a solid proof that is won't break other machines and if it does break. As 
far as PCI is concerned, it only affects machines with multiple handles that 
have the same PCI address returned by _ADR. This method is also used by USB 
and others which I am more worried about (still, I think it will not happen, 
but hey, this is ACPI).

Nowhere in the ACPI spec is stated what should happen with duplicate _ADR 
values for such devices. Maybe you can ask for a clarification on this at the 
ACPI team? (given that the ACPI spec is also worked on by Intel)?
(the Windows Hardware Certification Requirements [1] is also not helpful on 
this topic)

> Were I to bet, I'd say that it will break as many machines
> as it fixes.  And when it does, where are we?
> 
> Clearly we need to be using a more clever search algorithm.
Right, but what additional information can we throw at this in order to fix the 
linked bug? If you scan through it and wonder about the 0xFFFF value at _ADR, 
this is changed by PEGP._INI to 0.

If this patch will not be included, I propose to print at least a debugging 
message when a duplicate handle is found (i.e. if find->handle != NULL).

> thanks,
> Len Brown, Intel Open Source Technology Center
> 
> On 11/12/2012 05:28 PM, Lekensteyn wrote:
> > From: Peter Wu <lekensteyn@gmail.com>
> > 
> > When acpi_get_child is called, it loops through all direct child
> > devices and calls do_acpi_find_child to find an ACPI handle matching the
> > _ADR. The previous implementation visits every direct child, if there
> > were multiple devices with a matching _ADR, the last match was always
> > returned.
> > 
> > This behaviour (returning the last result) does not work for many recent
> > Lenovo machines when looking for the correct ACPI handle for a Nvidia
> > PCI video device. On these machines, the first handle should be used
> > instead of the last one. On these machines, the PCI Bus ID is 01:00.0,
> > hence the address that is searched for is 0 (via acpi_pci_find_device).
> > 
> > When acpi_pci_find_device calls acpi_get_child, it iterates through:
> >     Address: 00000000 (valid); handle: \_SB_.PCI0.PEG0.PEGP
> >     Address: 00000001 (valid); handle: \_SB_.PCI0.PEG0.VGA1
> >     Address: 00000000 (valid); handle: \_SB_.PCI0.PEG0.VGA_
> > 
> > The correct handle here is "\_SB_.PCI0.PEG0.PEGP" which contains the
> > _ROM method as well as _PSx for PM, and not \_SB.PCI0.PEG0.VGA.
> > 
> > On my own laptop, and many others I believe, there is only one matching
> > handle for a PCI device. The acpi_get_child method has been added in
> > 2005, 4e10d12a3d88c88fba3258809aa42d14fd8cf1d1, and there is no message
> > indicating that the last (or first) handle should be returned. However,
> > it is kind of useless to iterate from beginning to the end if you only
> > need the last match. Therefore, just return the first match (which is
> > likely the last one too in almost all cases).
> > 
> > Verified to work for an affected Lenovo Ideapad Y480, tested for no
> > regressions on my laptop (Clevo B7130).
> > 
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42696
> > 
> > Tested-by: Sergio Perez <dagobertstaler@gmail.com>
> > Signed-off-by: Peter Wu <lekensteyn@gmail.com>
> > ---
> > 
> >  drivers/acpi/glue.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/acpi/glue.c b/drivers/acpi/glue.c
> > index 0837308..dc9c945 100644
> > --- a/drivers/acpi/glue.c
> > +++ b/drivers/acpi/glue.c
> > @@ -99,18 +99,20 @@ struct acpi_find_child {
> > 
> >  static acpi_status
> >  do_acpi_find_child(acpi_handle handle, u32 lvl, void *context, void **rv)
> >  {
> > 
> > -	acpi_status status;
> > +	acpi_status status, ret = AE_OK;
> > 
> >  	struct acpi_device_info *info;
> >  	struct acpi_find_child *find = context;
> >  	
> >  	status = acpi_get_object_info(handle, &info);
> >  	if (ACPI_SUCCESS(status)) {
> >  	
> >  		if ((info->address == find->address)
> > 
> > -			&& (info->valid & ACPI_VALID_ADR))
> > +			&& (info->valid & ACPI_VALID_ADR)) {
> > 
> >  			find->handle = handle;
> > 
> > +			ret = AE_CTRL_TERMINATE;
> > +		}
> > 
> >  		kfree(info);
> >  	
> >  	}
> > 
> > -	return AE_OK;
> > +	return ret;
> > 
> >  }
> >  
> >  acpi_handle acpi_get_child(acpi_handle parent, u64 address)

 [1]: http://msdn.microsoft.com/en-US/library/windows/hardware/jj128256
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki Dec. 19, 2012, 11:31 a.m. UTC | #3
On Friday, November 16, 2012 07:37:34 PM Lekensteyn wrote:
> Thank you for your reply.
> 
> On Friday 16 November 2012 11:25:47 Len Brown wrote:
> > However, this patch can't possibly be the right way to go --
> > as it is just as broken as the code it replaces.
> This is a well-aimed shot in the dark based on the DSDT I see.

Can you post that DSDT by chance?

> It is certainly 
> not a solid proof that is won't break other machines and if it does break. As 
> far as PCI is concerned, it only affects machines with multiple handles that 
> have the same PCI address returned by _ADR.

I wonder if there are any other criteria we can use to choose the "best" handle
in those cases?  It doesn't look like choosing the first on or the last one
is really going to always work.

> This method is also used by USB 
> and others which I am more worried about (still, I think it will not happen, 
> but hey, this is ACPI).
> 
> Nowhere in the ACPI spec is stated what should happen with duplicate _ADR 
> values for such devices. Maybe you can ask for a clarification on this at the 
> ACPI team? (given that the ACPI spec is also worked on by Intel)?
> (the Windows Hardware Certification Requirements [1] is also not helpful on 
> this topic)

Well, added Bob Moore to the CC list, perhaps he knows something about that.

Thanks,
Rafael
Peter Wu Dec. 19, 2012, 10:40 p.m. UTC | #4
Hi,

On Wednesday 19 December 2012 12:31:35 Rafael J. Wysocki wrote:
> > This is a well-aimed shot in the dark based on the DSDT I see.
> 
> Can you post that DSDT by chance?
The DSDT for a Lenovo Ideapad Y470 has been posted before to 
https://bugzilla.kernel.org/show_bug.cgi?id=42696.

From dsdt.dsl,
line 9279 (context Device(\_SB.PCI0.PEG0)):

    Name (_ADR, 0x00010000)

line 9377 (context Device(\_SB.PCI0.PEG0.PEGP)):

    Name (_ADR, 0xFFFF)

line 9414 (context Device(\_SB.PCI0.PEG0.VGA)):
 Name (_ADR, Zero)

From ssdt6.dsl, line 1070, (context Device(\_SB.PCI0.PEG0.PEGP)):
    Method (_INI, 0, NotSerialized)
    {
        Store (Zero, \_SB.PCI0.PEG0.PEGP._ADR)
    }

> > It is certainly 
> > not a solid proof that is won't break other machines and if it does break.
> > As  far as PCI is concerned, it only affects machines with multiple
> > handles that have the same PCI address returned by _ADR.
> 
> I wonder if there are any other criteria we can use to choose the "best"
> handle in those cases?  It doesn't look like choosing the first on or the
> last one is really going to always work.
Currently I do not have other ideas to get this to work, but Intel may have an 
ACPI expert who can help out here based on experience.

Regards,
Peter
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki Dec. 26, 2012, 2:13 p.m. UTC | #5
On Wednesday, December 19, 2012 11:40:47 PM Lekensteyn wrote:
> Hi,
> 
> On Wednesday 19 December 2012 12:31:35 Rafael J. Wysocki wrote:
> > > This is a well-aimed shot in the dark based on the DSDT I see.
> > 
> > Can you post that DSDT by chance?
> The DSDT for a Lenovo Ideapad Y470 has been posted before to 
> https://bugzilla.kernel.org/show_bug.cgi?id=42696.
> 
> From dsdt.dsl,
> line 9279 (context Device(\_SB.PCI0.PEG0)):
> 
>     Name (_ADR, 0x00010000)
> 
> line 9377 (context Device(\_SB.PCI0.PEG0.PEGP)):
> 
>     Name (_ADR, 0xFFFF)
> 
> line 9414 (context Device(\_SB.PCI0.PEG0.VGA)):
>  Name (_ADR, Zero)
> 
> From ssdt6.dsl, line 1070, (context Device(\_SB.PCI0.PEG0.PEGP)):
>     Method (_INI, 0, NotSerialized)
>     {
>         Store (Zero, \_SB.PCI0.PEG0.PEGP._ADR)
>     }
> 

OK

But as far as I can say, there also are _PS0, _PS1 and _PS3 methods under
PEG0.VGA and there are display devices under it.  Moreover, it seems to have
_DOS defined (which PEGP doesn't have), so I'm not really sure we should
prefer PEG0.PEGP to PEG0.VGA after all?

I wonder what PEG0.PEGP and PEG0.VGA are supposed to represent, actually.

Thanks,
Rafael
Peter Wu Dec. 26, 2012, 3:15 p.m. UTC | #6
On Wednesday 26 December 2012 15:13:00 Rafael J. Wysocki wrote:
> But as far as I can say, there also are _PS0, _PS1 and _PS3 methods under
> PEG0.VGA and there are display devices under it.
_PS0, _PS1 and _PS3 just set _PSC for ACPI. It does interact with any 
hardware. The display adapter use "CADL" (\CADL based on the scope of 
CRT._DCS, not to be confused with \_SB.PCI0.GFX0.CADL (field IGDM)). This field 
is only written in NVIF. It seems to be a display adapter used for a nvidia 
graphics card.

I know that some Lenovo laptops allow you to select Intel / Nvidia / Optimus 
in their BIOS, maybe this PEG0.VGA is used for the nvidia-only setup (lazy 
Lenovo engineers?). Note that this does *not* apply to this IdeaPad Y570 or a 
Lenovo IdeaPad Y470, these laptops force you into the Optimus mode which means 
that you have two graphics cards activated.

> Moreover, it seems to have
> _DOS defined (which PEGP doesn't have), so I'm not really sure we should
> prefer PEG0.PEGP to PEG0.VGA after all?
The code is a subset of what \_SB.PCI0.GFX0._DOS does.
VGA._DOS:

    Store (And (Arg0, 0x07), DSEN)
GFX0._DOS:

    Store (And (Arg0, 0x07), DSEN)
    If (LEqual (And (Arg0, 0x03), Zero)) {
        If (CondRefOf (HDOS)) {
            HDOS ()
        }
    }

> I wonder what PEG0.PEGP and PEG0.VGA are supposed to represent, actually.
PEGP represents the more powerful Nvidia graphics card. GFX0 is the more 
power-aggressive Intel graphics card. VGA... is something useless? A "it 
works, do not touch anything" leftover? There is also a completely useless 
VGA1 device. To quote Linus [1]:

> It's that simple. Don't expect anything else. Firmware is written by
> people who have lost the will to live (why? Because they do firmware
> development for a living), and the only thing keeping them going is
> the drugs. And they're not the "fun" kind of drugs. The end result is
> predictable. In their drug-induced haze, they make a mess.

Regards,
Peter

 [1]: http://lkml.org/lkml/2012/10/21/75
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/acpi/glue.c b/drivers/acpi/glue.c
index 0837308..dc9c945 100644
--- a/drivers/acpi/glue.c
+++ b/drivers/acpi/glue.c
@@ -99,18 +99,20 @@  struct acpi_find_child {
 static acpi_status
 do_acpi_find_child(acpi_handle handle, u32 lvl, void *context, void **rv)
 {
-	acpi_status status;
+	acpi_status status, ret = AE_OK;
 	struct acpi_device_info *info;
 	struct acpi_find_child *find = context;
 
 	status = acpi_get_object_info(handle, &info);
 	if (ACPI_SUCCESS(status)) {
 		if ((info->address == find->address)
-			&& (info->valid & ACPI_VALID_ADR))
+			&& (info->valid & ACPI_VALID_ADR)) {
 			find->handle = handle;
+			ret = AE_CTRL_TERMINATE;
+		}
 		kfree(info);
 	}
-	return AE_OK;
+	return ret;
 }
 
 acpi_handle acpi_get_child(acpi_handle parent, u64 address)