diff mbox

[REGRESSION,3.14-rc6] Samsung N150 lid does not "open" after suspend to RAM.

Message ID CAOLK0pxh4b3uzjD4jaPssD66K8g33uqahV1SzOAJAmQmnY4XqQ@mail.gmail.com (mailing list archive)
State RFC, archived
Headers show

Commit Message

Tianyu Lan March 24, 2014, 9:30 a.m. UTC
Please try the following patch.




2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
> Hi,
>
> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
> system is set up, so that it should suspend to RAM as soon as the lid is
> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
> correctly the first time (and the system suspends), but after resuming
> from standby (by opening the lid), the lid does not change to "open" again.
> Of course, closing the lid again does not induce suspend to RAM then.
> Opening the lid now (while not sleeping), makes ACPI notify the opening,
> so I guess ACPI "misses" or discards the lid open event from the EC when
> coming from sleep.
> Now, closing the lid again does induce suspend to RAM. This behaviour is
> reproducible: every other time, suspending works.
>
> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
> Clear stale EC events on Samsung systems.
> Which was introduced after 3.14-rc5.
>
> When opening the lid to resume from standby, i see in dmesg:
> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
> events cleared
> (which comes from drivers/acpi/ec.c)
>
> Seems to me, that the "open" event is cleared from the EC, but also
> discarded instead of passed on. Shouldn't the correct behaviour be to
> report all the pending events, read from the EC, as ACPI events? Can you
> point me in a direction for fixing the issue cleanly, then I will try to
> find a solution and prepare a patch for this issue.
>
> Best regards,
> Stefan
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Stefan Biereigel March 24, 2014, 11:19 a.m. UTC | #1
Hi,
thank you for the suggestion. The patch resolves the issue on my N150
when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
problems to mine now exist on the Samsung Series 7/9 notebooks?

Is any further action from my part required?

Regards,
Stefan

Am 24.03.2014 10:30, schrieb Lan Tianyu:
> Please try the following patch.
> 
> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> index d7d32c2..9239527 100644
> --- a/drivers/acpi/ec.c
> +++ b/drivers/acpi/ec.c
> @@ -1027,8 +1027,13 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
>         DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
>         DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
>         {
> -       ec_clear_on_resume, "Samsung hardware", {
> -       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
> +       ec_clear_on_resume, "Samsung NP530U3B", {
> +       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
> +       DMI_MATCH(DMI_PRODUCT_NAME, "530U3BI/530U4BI/530U4BH"),}, NULL},
> +       {
> +       ec_clear_on_resume, "Samsung NP530U3C", {
> +       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
> +       DMI_MATCH(DMI_PRODUCT_NAME, "530U3C/530U4C/532U3C"),}, NULL},
>         {},
>  };
> 
> 
> 
> 
> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>> Hi,
>>
>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>> system is set up, so that it should suspend to RAM as soon as the lid is
>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>> correctly the first time (and the system suspends), but after resuming
>> from standby (by opening the lid), the lid does not change to "open" again.
>> Of course, closing the lid again does not induce suspend to RAM then.
>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>> so I guess ACPI "misses" or discards the lid open event from the EC when
>> coming from sleep.
>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>> reproducible: every other time, suspending works.
>>
>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>> Clear stale EC events on Samsung systems.
>> Which was introduced after 3.14-rc5.
>>
>> When opening the lid to resume from standby, i see in dmesg:
>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>> events cleared
>> (which comes from drivers/acpi/ec.c)
>>
>> Seems to me, that the "open" event is cleared from the EC, but also
>> discarded instead of passed on. Shouldn't the correct behaviour be to
>> report all the pending events, read from the EC, as ACPI events? Can you
>> point me in a direction for fixing the issue cleanly, then I will try to
>> find a solution and prepare a patch for this issue.
>>
>> Best regards,
>> Stefan
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tianyu Lan March 25, 2014, 9:34 a.m. UTC | #2
2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
> Hi,
> thank you for the suggestion. The patch resolves the issue on my N150
> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
> problems to mine now exist on the Samsung Series 7/9 notebooks?
>
> Is any further action from my part required?

Do you have these machines? If yes, please provide the output of
dmidecode command.

Cc guys  of commit ad332c8a.

>
> Regards,
> Stefan
>
> Am 24.03.2014 10:30, schrieb Lan Tianyu:
>> Please try the following patch.
>>
>> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
>> index d7d32c2..9239527 100644
>> --- a/drivers/acpi/ec.c
>> +++ b/drivers/acpi/ec.c
>> @@ -1027,8 +1027,13 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
>>         DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
>>         DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
>>         {
>> -       ec_clear_on_resume, "Samsung hardware", {
>> -       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
>> +       ec_clear_on_resume, "Samsung NP530U3B", {
>> +       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
>> +       DMI_MATCH(DMI_PRODUCT_NAME, "530U3BI/530U4BI/530U4BH"),}, NULL},
>> +       {
>> +       ec_clear_on_resume, "Samsung NP530U3C", {
>> +       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
>> +       DMI_MATCH(DMI_PRODUCT_NAME, "530U3C/530U4C/532U3C"),}, NULL},
>>         {},
>>  };
>>
>>
>>
>>
>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>> Hi,
>>>
>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>> correctly the first time (and the system suspends), but after resuming
>>> from standby (by opening the lid), the lid does not change to "open" again.
>>> Of course, closing the lid again does not induce suspend to RAM then.
>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>> coming from sleep.
>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>> reproducible: every other time, suspending works.
>>>
>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>> Clear stale EC events on Samsung systems.
>>> Which was introduced after 3.14-rc5.
>>>
>>> When opening the lid to resume from standby, i see in dmesg:
>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>> events cleared
>>> (which comes from drivers/acpi/ec.c)
>>>
>>> Seems to me, that the "open" event is cleared from the EC, but also
>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>> report all the pending events, read from the EC, as ACPI events? Can you
>>> point me in a direction for fixing the issue cleanly, then I will try to
>>> find a solution and prepare a patch for this issue.
>>>
>>> Best regards,
>>> Stefan
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>>
Stefan Biereigel March 25, 2014, 9:43 a.m. UTC | #3
Sorry, I sadly do not have any of these machines. If I get my hands on
one, I will post dmidecode.

Thanks for your help,
Stefan


Am 25.03.2014 10:34, schrieb Lan Tianyu:
> 2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>> Hi,
>> thank you for the suggestion. The patch resolves the issue on my N150
>> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
>> problems to mine now exist on the Samsung Series 7/9 notebooks?
>>
>> Is any further action from my part required?
> 
> Do you have these machines? If yes, please provide the output of
> dmidecode command.
> 
> Cc guys  of commit ad332c8a.
> 
>>
>> Regards,
>> Stefan
>>
>> Am 24.03.2014 10:30, schrieb Lan Tianyu:
>>> Please try the following patch.
>>>
>>> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
>>> index d7d32c2..9239527 100644
>>> --- a/drivers/acpi/ec.c
>>> +++ b/drivers/acpi/ec.c
>>> @@ -1027,8 +1027,13 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
>>>         DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
>>>         DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
>>>         {
>>> -       ec_clear_on_resume, "Samsung hardware", {
>>> -       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
>>> +       ec_clear_on_resume, "Samsung NP530U3B", {
>>> +       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
>>> +       DMI_MATCH(DMI_PRODUCT_NAME, "530U3BI/530U4BI/530U4BH"),}, NULL},
>>> +       {
>>> +       ec_clear_on_resume, "Samsung NP530U3C", {
>>> +       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
>>> +       DMI_MATCH(DMI_PRODUCT_NAME, "530U3C/530U4C/532U3C"),}, NULL},
>>>         {},
>>>  };
>>>
>>>
>>>
>>>
>>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>> Hi,
>>>>
>>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>>> correctly the first time (and the system suspends), but after resuming
>>>> from standby (by opening the lid), the lid does not change to "open" again.
>>>> Of course, closing the lid again does not induce suspend to RAM then.
>>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>>> coming from sleep.
>>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>>> reproducible: every other time, suspending works.
>>>>
>>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>>> Clear stale EC events on Samsung systems.
>>>> Which was introduced after 3.14-rc5.
>>>>
>>>> When opening the lid to resume from standby, i see in dmesg:
>>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>>> events cleared
>>>> (which comes from drivers/acpi/ec.c)
>>>>
>>>> Seems to me, that the "open" event is cleared from the EC, but also
>>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>>> report all the pending events, read from the EC, as ACPI events? Can you
>>>> point me in a direction for fixing the issue cleanly, then I will try to
>>>> find a solution and prepare a patch for this issue.
>>>>
>>>> Best regards,
>>>> Stefan
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>>
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kieran Clancy March 25, 2014, 1:23 p.m. UTC | #4
On Tue, Mar 25, 2014 at 8:04 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
> 2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>> Hi,
>> thank you for the suggestion. The patch resolves the issue on my N150
>> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
>> problems to mine now exist on the Samsung Series 7/9 notebooks?
>>
>> Is any further action from my part required?
>
> Do you have these machines? If yes, please provide the output of
> dmidecode command.
>
> Cc guys  of commit ad332c8a.

Hi guys,

That's a surprising side-effect, although I guess I shouldn't be
surprised by Samsung ACPI weirdness anymore.

If we can, I'd like to get to the bottom of this rather than just turn
off this fix (which we know works for series 5, 7 and 9 without
problems).

>>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>> Hi,
>>>>
>>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>>> correctly the first time (and the system suspends), but after resuming
>>>> from standby (by opening the lid), the lid does not change to "open" again.
>>>> Of course, closing the lid again does not induce suspend to RAM then.
>>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>>> coming from sleep.
>>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>>> reproducible: every other time, suspending works.
>>>>
>>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>>> Clear stale EC events on Samsung systems.
>>>> Which was introduced after 3.14-rc5.
>>>>
>>>> When opening the lid to resume from standby, i see in dmesg:
>>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>>> events cleared
>>>> (which comes from drivers/acpi/ec.c)
>>>>
>>>> Seems to me, that the "open" event is cleared from the EC, but also
>>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>>> report all the pending events, read from the EC, as ACPI events? Can you
>>>> point me in a direction for fixing the issue cleanly, then I will try to
>>>> find a solution and prepare a patch for this issue.

Stefan, thank you for reporting this issue.

Our rationale for discarding the events was that events queued during
sleep are probably no longer relevant. There could also be other
unwanted side-effects of blindly executing all of the old
instructions. But in your case, this assumption might be wrong.

What command are you using to check if the lid is "open" or "closed"?
Is it because the screen is not waking up, or some other effect, or
just because it won't suspend again when it's re-closed?

Do other events like AC plug/unplug affect any of this if you do them
during this bad state?

I'd like to see exactly which EC command byte is being thrown away
here. If you do something like this (with dynamic debug enabled)

echo -n 'file ec.c +p' | sudo tee /sys/kernel/debug/dynamic_debug/control

You should get massively verbose EC stuff filling your dmesg, but I am
just interested in the EC read/write bytes just before and around the
"1 stale EC events cleared" message. Grab this out of dmesg before it
fills with other stuff.

This will tell us what command we are being asked to run. If you can,
please do it a few times to see if it's the same command each time or
something different.

You can turn the debug output off again with:

echo -n 'file ec.c -p' | sudo tee /sys/kernel/debug/dynamic_debug/control

I might also need a copy of your DSDT, if you can send me that
separately in another email (not to the list):

cat /sys/firmware/acpi/tables/DSDT > .DSDT.aml

Thank you,
Kieran.
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Juan Manuel Cabo March 25, 2014, 1:53 p.m. UTC | #5
I bet that his _WAK dsdt method isn't re-checking the lid state on resume
(maybe I'm wrong).

I retested on my system just to make sure, and the lid is correctly reported open
always after resume from a sleep by closing the lid, using a different kernel with that same patch.

My system is the Samsung Series 5 NP530U3C in which the patch works fine:
      
           Manufacturer: SAMSUNG ELECTRONICS CO., LTD.
           Product Name: 530U3C/530U4C

"sudo acpi_listen" shows the following on resume:

      video GFX0 00000080 00000000
      button/lid LID0 00000080 00000001
      button/lid LID0 00000080 00000002

"cat /proc/acpi/button/lid/LID0/state" shows its open always on resume. I suspended
several times, and on each time it was open.

I bet that that system's _WAK method doesn't recheck the lid, I might be wrong.
*Stefan*: Can you extract that method and send it? Something like this:

           sudo cat /sys/firmware/acpi/tables/DSDT > /tmp/dsdt.bin
           cd /tmp/dsdt.bin
           iasl -d dsdt.bin
           # AND EDIT the resulting dsdt.dsl and search for and send us the _WAK method

Also, on my system, the DSDT Q event for lid state change reads it from the EC to the LIDS var,
instead of just toggling the "LIDS" variable (Lid Status). My own dsdt is downloadable from my
blog, at http://zenstep.com.ar/files/DSDT_SamsungSeries5-NP530U3c-AB1_WithBios_P14AAJ.dsl

Cheers!
--
Juan Manuel Cabo


On 03/25/2014 10:23 AM, Kieran Clancy wrote:
> On Tue, Mar 25, 2014 at 8:04 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
>> 2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>> Hi,
>>> thank you for the suggestion. The patch resolves the issue on my N150
>>> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
>>> problems to mine now exist on the Samsung Series 7/9 notebooks?
>>>
>>> Is any further action from my part required?
>> Do you have these machines? If yes, please provide the output of
>> dmidecode command.
>>
>> Cc guys  of commit ad332c8a.
> Hi guys,
>
> That's a surprising side-effect, although I guess I shouldn't be
> surprised by Samsung ACPI weirdness anymore.
>
> If we can, I'd like to get to the bottom of this rather than just turn
> off this fix (which we know works for series 5, 7 and 9 without
> problems).
>
>>>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>>> Hi,
>>>>>
>>>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>>>> correctly the first time (and the system suspends), but after resuming
>>>>> from standby (by opening the lid), the lid does not change to "open" again.
>>>>> Of course, closing the lid again does not induce suspend to RAM then.
>>>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>>>> coming from sleep.
>>>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>>>> reproducible: every other time, suspending works.
>>>>>
>>>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>>>> Clear stale EC events on Samsung systems.
>>>>> Which was introduced after 3.14-rc5.
>>>>>
>>>>> When opening the lid to resume from standby, i see in dmesg:
>>>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>>>> events cleared
>>>>> (which comes from drivers/acpi/ec.c)
>>>>>
>>>>> Seems to me, that the "open" event is cleared from the EC, but also
>>>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>>>> report all the pending events, read from the EC, as ACPI events? Can you
>>>>> point me in a direction for fixing the issue cleanly, then I will try to
>>>>> find a solution and prepare a patch for this issue.
> Stefan, thank you for reporting this issue.
>
> Our rationale for discarding the events was that events queued during
> sleep are probably no longer relevant. There could also be other
> unwanted side-effects of blindly executing all of the old
> instructions. But in your case, this assumption might be wrong.
>
> What command are you using to check if the lid is "open" or "closed"?
> Is it because the screen is not waking up, or some other effect, or
> just because it won't suspend again when it's re-closed?
>
> Do other events like AC plug/unplug affect any of this if you do them
> during this bad state?
>
> I'd like to see exactly which EC command byte is being thrown away
> here. If you do something like this (with dynamic debug enabled)
>
> echo -n 'file ec.c +p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>
> You should get massively verbose EC stuff filling your dmesg, but I am
> just interested in the EC read/write bytes just before and around the
> "1 stale EC events cleared" message. Grab this out of dmesg before it
> fills with other stuff.
>
> This will tell us what command we are being asked to run. If you can,
> please do it a few times to see if it's the same command each time or
> something different.
>
> You can turn the debug output off again with:
>
> echo -n 'file ec.c -p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>
> I might also need a copy of your DSDT, if you can send me that
> separately in another email (not to the list):
>
> cat /sys/firmware/acpi/tables/DSDT > .DSDT.aml
>
> Thank you,
> Kieran.
>

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Biereigel March 25, 2014, 4:07 p.m. UTC | #6
Hello,

I can see the same acpi_listen events with a working kernel (patch
applied or kernels before rc5), but the lid open is not there with rc6
and rc7 (unpatched).

You can find my _WAK below. It seems not to do anything with LIDS, so I
guess this is the cause for the problem.

Method (_WAK, 1, NotSerialized)
{
    Store (0x00, \_SB.PCI0.PEXE)
    If (LEqual (Arg0, 0x03))
    {
        Store (0x01, \SPNF)
        TRAP (0x46)
        TRAP (0x50)
        \_SB.SECS (0xB3)
        P8XH (0x00, 0x03)
    }

    Store (\_SB.PCI0.LPC0.H_EC.ACEX, \PWRS)
    If (LEqual (Arg0, 0x04))
    {
        \_SB.OSHT ()
        If (DTSE)
        {
            TRAP (0x47)
        }

        P8XH (0x00, 0x04)
    }

    \_SB.SECS (0xAA)
    PNOT ()
    If (LEqual (OSYS, 0x07CE)) {}
}

Am 25.03.2014 14:53, schrieb Juan Manuel Cabo:
> I bet that his _WAK dsdt method isn't re-checking the lid state on resume
> (maybe I'm wrong).
> 
> I retested on my system just to make sure, and the lid is correctly reported open
> always after resume from a sleep by closing the lid, using a different kernel with that same patch.
> 
> My system is the Samsung Series 5 NP530U3C in which the patch works fine:
>       
>            Manufacturer: SAMSUNG ELECTRONICS CO., LTD.
>            Product Name: 530U3C/530U4C
> 
> "sudo acpi_listen" shows the following on resume:
> 
>       video GFX0 00000080 00000000
>       button/lid LID0 00000080 00000001
>       button/lid LID0 00000080 00000002
> 
> "cat /proc/acpi/button/lid/LID0/state" shows its open always on resume. I suspended
> several times, and on each time it was open.
> 
> I bet that that system's _WAK method doesn't recheck the lid, I might be wrong.
> *Stefan*: Can you extract that method and send it? Something like this:
> 
>            sudo cat /sys/firmware/acpi/tables/DSDT > /tmp/dsdt.bin
>            cd /tmp/dsdt.bin
>            iasl -d dsdt.bin
>            # AND EDIT the resulting dsdt.dsl and search for and send us the _WAK method
> 
> Also, on my system, the DSDT Q event for lid state change reads it from the EC to the LIDS var,
> instead of just toggling the "LIDS" variable (Lid Status). My own dsdt is downloadable from my
> blog, at http://zenstep.com.ar/files/DSDT_SamsungSeries5-NP530U3c-AB1_WithBios_P14AAJ.dsl
> 
> Cheers!
> --
> Juan Manuel Cabo
> 
> 
> On 03/25/2014 10:23 AM, Kieran Clancy wrote:
>> On Tue, Mar 25, 2014 at 8:04 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
>>> 2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>> Hi,
>>>> thank you for the suggestion. The patch resolves the issue on my N150
>>>> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
>>>> problems to mine now exist on the Samsung Series 7/9 notebooks?
>>>>
>>>> Is any further action from my part required?
>>> Do you have these machines? If yes, please provide the output of
>>> dmidecode command.
>>>
>>> Cc guys  of commit ad332c8a.
>> Hi guys,
>>
>> That's a surprising side-effect, although I guess I shouldn't be
>> surprised by Samsung ACPI weirdness anymore.
>>
>> If we can, I'd like to get to the bottom of this rather than just turn
>> off this fix (which we know works for series 5, 7 and 9 without
>> problems).
>>
>>>>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>>>> Hi,
>>>>>>
>>>>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>>>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>>>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>>>>> correctly the first time (and the system suspends), but after resuming
>>>>>> from standby (by opening the lid), the lid does not change to "open" again.
>>>>>> Of course, closing the lid again does not induce suspend to RAM then.
>>>>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>>>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>>>>> coming from sleep.
>>>>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>>>>> reproducible: every other time, suspending works.
>>>>>>
>>>>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>>>>> Clear stale EC events on Samsung systems.
>>>>>> Which was introduced after 3.14-rc5.
>>>>>>
>>>>>> When opening the lid to resume from standby, i see in dmesg:
>>>>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>>>>> events cleared
>>>>>> (which comes from drivers/acpi/ec.c)
>>>>>>
>>>>>> Seems to me, that the "open" event is cleared from the EC, but also
>>>>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>>>>> report all the pending events, read from the EC, as ACPI events? Can you
>>>>>> point me in a direction for fixing the issue cleanly, then I will try to
>>>>>> find a solution and prepare a patch for this issue.
>> Stefan, thank you for reporting this issue.
>>
>> Our rationale for discarding the events was that events queued during
>> sleep are probably no longer relevant. There could also be other
>> unwanted side-effects of blindly executing all of the old
>> instructions. But in your case, this assumption might be wrong.
>>
>> What command are you using to check if the lid is "open" or "closed"?
>> Is it because the screen is not waking up, or some other effect, or
>> just because it won't suspend again when it's re-closed?
>>
>> Do other events like AC plug/unplug affect any of this if you do them
>> during this bad state?
>>
>> I'd like to see exactly which EC command byte is being thrown away
>> here. If you do something like this (with dynamic debug enabled)
>>
>> echo -n 'file ec.c +p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>>
>> You should get massively verbose EC stuff filling your dmesg, but I am
>> just interested in the EC read/write bytes just before and around the
>> "1 stale EC events cleared" message. Grab this out of dmesg before it
>> fills with other stuff.
>>
>> This will tell us what command we are being asked to run. If you can,
>> please do it a few times to see if it's the same command each time or
>> something different.
>>
>> You can turn the debug output off again with:
>>
>> echo -n 'file ec.c -p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>>
>> I might also need a copy of your DSDT, if you can send me that
>> separately in another email (not to the list):
>>
>> cat /sys/firmware/acpi/tables/DSDT > .DSDT.aml
>>
>> Thank you,
>> Kieran.
>>
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Biereigel March 25, 2014, 4:38 p.m. UTC | #7
Hello Kieran,

thanks for the input. I use /proc/acpi/button/lid/LID0/state as an
indicator - whenever I resume from sleep on a bad kernel, state is
"closed" afterwards, until i close and re-open the lid.

I will build a kernel with dynamic debug enabled and grab the output if
it is of any help. Until then - my DSDT is uploaded here:
http://filebin.ca/1GkDaYm5U1lp/dsdt-samsung-n150

Stefan

Am 25.03.2014 14:23, schrieb Kieran Clancy:
> On Tue, Mar 25, 2014 at 8:04 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
>> 2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>> Hi,
>>> thank you for the suggestion. The patch resolves the issue on my N150
>>> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
>>> problems to mine now exist on the Samsung Series 7/9 notebooks?
>>>
>>> Is any further action from my part required?
>>
>> Do you have these machines? If yes, please provide the output of
>> dmidecode command.
>>
>> Cc guys  of commit ad332c8a.
> 
> Hi guys,
> 
> That's a surprising side-effect, although I guess I shouldn't be
> surprised by Samsung ACPI weirdness anymore.
> 
> If we can, I'd like to get to the bottom of this rather than just turn
> off this fix (which we know works for series 5, 7 and 9 without
> problems).
> 
>>>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>>> Hi,
>>>>>
>>>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>>>> correctly the first time (and the system suspends), but after resuming
>>>>> from standby (by opening the lid), the lid does not change to "open" again.
>>>>> Of course, closing the lid again does not induce suspend to RAM then.
>>>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>>>> coming from sleep.
>>>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>>>> reproducible: every other time, suspending works.
>>>>>
>>>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>>>> Clear stale EC events on Samsung systems.
>>>>> Which was introduced after 3.14-rc5.
>>>>>
>>>>> When opening the lid to resume from standby, i see in dmesg:
>>>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>>>> events cleared
>>>>> (which comes from drivers/acpi/ec.c)
>>>>>
>>>>> Seems to me, that the "open" event is cleared from the EC, but also
>>>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>>>> report all the pending events, read from the EC, as ACPI events? Can you
>>>>> point me in a direction for fixing the issue cleanly, then I will try to
>>>>> find a solution and prepare a patch for this issue.
> 
> Stefan, thank you for reporting this issue.
> 
> Our rationale for discarding the events was that events queued during
> sleep are probably no longer relevant. There could also be other
> unwanted side-effects of blindly executing all of the old
> instructions. But in your case, this assumption might be wrong.
> 
> What command are you using to check if the lid is "open" or "closed"?
> Is it because the screen is not waking up, or some other effect, or
> just because it won't suspend again when it's re-closed?
> 
> Do other events like AC plug/unplug affect any of this if you do them
> during this bad state?
> 
> I'd like to see exactly which EC command byte is being thrown away
> here. If you do something like this (with dynamic debug enabled)
> 
> echo -n 'file ec.c +p' | sudo tee /sys/kernel/debug/dynamic_debug/control
> 
> You should get massively verbose EC stuff filling your dmesg, but I am
> just interested in the EC read/write bytes just before and around the
> "1 stale EC events cleared" message. Grab this out of dmesg before it
> fills with other stuff.
> 
> This will tell us what command we are being asked to run. If you can,
> please do it a few times to see if it's the same command each time or
> something different.
> 
> You can turn the debug output off again with:
> 
> echo -n 'file ec.c -p' | sudo tee /sys/kernel/debug/dynamic_debug/control
> 
> I might also need a copy of your DSDT, if you can send me that
> separately in another email (not to the list):
> 
> cat /sys/firmware/acpi/tables/DSDT > .DSDT.aml
> 
> Thank you,
> Kieran.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Biereigel March 25, 2014, 8:24 p.m. UTC | #8
Alright - as I have some spare time tonight, here is the dmesg of one
sleep/resume cycle on an unpatched 3.14-rc8 tree with my usual config +
dynamic debug enabled. After resuming, lid state was still "closed", so
the bug was triggered (and the stale event is removed as stated).

http://pastebin.ca/2679209

You can find the "EC: 1 stale EC events cleared" Line @ Line 316. I
really did not expect that much traffic from the EC, interesting to get
some insight into that!

Regarding other actions: The system notices removal of the AC cord when
in sleep (xfce4-power-manager reports the correct symbol after the
wakeup), so I guess this is working as it should (as it is handled by
the _WAK-Routine??).

Hope that helps!
Stefan


Am 25.03.2014 14:23, schrieb Kieran Clancy:
> On Tue, Mar 25, 2014 at 8:04 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
>> 2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>> Hi,
>>> thank you for the suggestion. The patch resolves the issue on my N150
>>> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
>>> problems to mine now exist on the Samsung Series 7/9 notebooks?
>>>
>>> Is any further action from my part required?
>>
>> Do you have these machines? If yes, please provide the output of
>> dmidecode command.
>>
>> Cc guys  of commit ad332c8a.
> 
> Hi guys,
> 
> That's a surprising side-effect, although I guess I shouldn't be
> surprised by Samsung ACPI weirdness anymore.
> 
> If we can, I'd like to get to the bottom of this rather than just turn
> off this fix (which we know works for series 5, 7 and 9 without
> problems).
> 
>>>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>>> Hi,
>>>>>
>>>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>>>> correctly the first time (and the system suspends), but after resuming
>>>>> from standby (by opening the lid), the lid does not change to "open" again.
>>>>> Of course, closing the lid again does not induce suspend to RAM then.
>>>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>>>> coming from sleep.
>>>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>>>> reproducible: every other time, suspending works.
>>>>>
>>>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>>>> Clear stale EC events on Samsung systems.
>>>>> Which was introduced after 3.14-rc5.
>>>>>
>>>>> When opening the lid to resume from standby, i see in dmesg:
>>>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>>>> events cleared
>>>>> (which comes from drivers/acpi/ec.c)
>>>>>
>>>>> Seems to me, that the "open" event is cleared from the EC, but also
>>>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>>>> report all the pending events, read from the EC, as ACPI events? Can you
>>>>> point me in a direction for fixing the issue cleanly, then I will try to
>>>>> find a solution and prepare a patch for this issue.
> 
> Stefan, thank you for reporting this issue.
> 
> Our rationale for discarding the events was that events queued during
> sleep are probably no longer relevant. There could also be other
> unwanted side-effects of blindly executing all of the old
> instructions. But in your case, this assumption might be wrong.
> 
> What command are you using to check if the lid is "open" or "closed"?
> Is it because the screen is not waking up, or some other effect, or
> just because it won't suspend again when it's re-closed?
> 
> Do other events like AC plug/unplug affect any of this if you do them
> during this bad state?
> 
> I'd like to see exactly which EC command byte is being thrown away
> here. If you do something like this (with dynamic debug enabled)
> 
> echo -n 'file ec.c +p' | sudo tee /sys/kernel/debug/dynamic_debug/control
> 
> You should get massively verbose EC stuff filling your dmesg, but I am
> just interested in the EC read/write bytes just before and around the
> "1 stale EC events cleared" message. Grab this out of dmesg before it
> fills with other stuff.
> 
> This will tell us what command we are being asked to run. If you can,
> please do it a few times to see if it's the same command each time or
> something different.
> 
> You can turn the debug output off again with:
> 
> echo -n 'file ec.c -p' | sudo tee /sys/kernel/debug/dynamic_debug/control
> 
> I might also need a copy of your DSDT, if you can send me that
> separately in another email (not to the list):
> 
> cat /sys/firmware/acpi/tables/DSDT > .DSDT.aml
> 
> Thank you,
> Kieran.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Juan Manuel Cabo March 25, 2014, 8:41 p.m. UTC | #9
I can see a LID event being discarded here on resume:

      [  728.861983] ACPI : EC: <--- command = 0x84
      [  728.864062] ACPI : EC: ---> status = 0x09
      [  728.864070] ACPI : EC: ---> data = 0x5f
      [  728.864075] ACPI : EC: <--- command = 0x84
      [  728.868054] ACPI : EC: ---> status = 0x09
      [  728.868061] ACPI : EC: ---> data = 0x00
      [  728.868066] ACPI : EC: 1 stale EC events cleared
      [  728.868079] ACPI: Waking up from system sleep state S3

the 0x5F event means it's a LID change event (both in your dsdt and mine, look for: Q5F)

But there is still something I'm not sure of. So I will test that RC8 kernel tonight on my
laptop and let you all know. If RC8 still works on my laptop, then I guess the best would be
to make the EC fix specific to series 5, 7, 9 laptops, so your N150 is not affected anymore.
Thank you so much Stefan for your in-depth tests!

Cheers!
--
Juan Manuel Cabo



On 03/25/2014 05:24 PM, Stefan Biereigel wrote:
> Alright - as I have some spare time tonight, here is the dmesg of one
> sleep/resume cycle on an unpatched 3.14-rc8 tree with my usual config +
> dynamic debug enabled. After resuming, lid state was still "closed", so
> the bug was triggered (and the stale event is removed as stated).
>
> http://pastebin.ca/2679209
>
> You can find the "EC: 1 stale EC events cleared" Line @ Line 316. I
> really did not expect that much traffic from the EC, interesting to get
> some insight into that!
>
> Regarding other actions: The system notices removal of the AC cord when
> in sleep (xfce4-power-manager reports the correct symbol after the
> wakeup), so I guess this is working as it should (as it is handled by
> the _WAK-Routine??).
>
> Hope that helps!
> Stefan
>
>
> Am 25.03.2014 14:23, schrieb Kieran Clancy:
>> On Tue, Mar 25, 2014 at 8:04 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
>>> 2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>> Hi,
>>>> thank you for the suggestion. The patch resolves the issue on my N150
>>>> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
>>>> problems to mine now exist on the Samsung Series 7/9 notebooks?
>>>>
>>>> Is any further action from my part required?
>>> Do you have these machines? If yes, please provide the output of
>>> dmidecode command.
>>>
>>> Cc guys  of commit ad332c8a.
>> Hi guys,
>>
>> That's a surprising side-effect, although I guess I shouldn't be
>> surprised by Samsung ACPI weirdness anymore.
>>
>> If we can, I'd like to get to the bottom of this rather than just turn
>> off this fix (which we know works for series 5, 7 and 9 without
>> problems).
>>
>>>>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>>>> Hi,
>>>>>>
>>>>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>>>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>>>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>>>>> correctly the first time (and the system suspends), but after resuming
>>>>>> from standby (by opening the lid), the lid does not change to "open" again.
>>>>>> Of course, closing the lid again does not induce suspend to RAM then.
>>>>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>>>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>>>>> coming from sleep.
>>>>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>>>>> reproducible: every other time, suspending works.
>>>>>>
>>>>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>>>>> Clear stale EC events on Samsung systems.
>>>>>> Which was introduced after 3.14-rc5.
>>>>>>
>>>>>> When opening the lid to resume from standby, i see in dmesg:
>>>>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>>>>> events cleared
>>>>>> (which comes from drivers/acpi/ec.c)
>>>>>>
>>>>>> Seems to me, that the "open" event is cleared from the EC, but also
>>>>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>>>>> report all the pending events, read from the EC, as ACPI events? Can you
>>>>>> point me in a direction for fixing the issue cleanly, then I will try to
>>>>>> find a solution and prepare a patch for this issue.
>> Stefan, thank you for reporting this issue.
>>
>> Our rationale for discarding the events was that events queued during
>> sleep are probably no longer relevant. There could also be other
>> unwanted side-effects of blindly executing all of the old
>> instructions. But in your case, this assumption might be wrong.
>>
>> What command are you using to check if the lid is "open" or "closed"?
>> Is it because the screen is not waking up, or some other effect, or
>> just because it won't suspend again when it's re-closed?
>>
>> Do other events like AC plug/unplug affect any of this if you do them
>> during this bad state?
>>
>> I'd like to see exactly which EC command byte is being thrown away
>> here. If you do something like this (with dynamic debug enabled)
>>
>> echo -n 'file ec.c +p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>>
>> You should get massively verbose EC stuff filling your dmesg, but I am
>> just interested in the EC read/write bytes just before and around the
>> "1 stale EC events cleared" message. Grab this out of dmesg before it
>> fills with other stuff.
>>
>> This will tell us what command we are being asked to run. If you can,
>> please do it a few times to see if it's the same command each time or
>> something different.
>>
>> You can turn the debug output off again with:
>>
>> echo -n 'file ec.c -p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>>
>> I might also need a copy of your DSDT, if you can send me that
>> separately in another email (not to the list):
>>
>> cat /sys/firmware/acpi/tables/DSDT > .DSDT.aml
>>
>> Thank you,
>> Kieran.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Juan Manuel Cabo March 25, 2014, 10:56 p.m. UTC | #10
Just tested with 3.13.7 which contains the EC fix, and I get the same
discarded event as your pastbin shows, but instead my LID status is
correctly reported as "open" always.

So now I'm sure it boils down to your _WAK method not rechecking the LID state (mine
does, I have a Samsung Series 5 NP530U3C).
    If it did recheck the LID state and issue, then it wouldn't have mattered whether
an accumulated LID event was discarded.


So I see two possible fixes:
    1) Modify the EC fix so that it doesn't discard events and instead processes them.
or
    2) Restrict the models to which the EC fix applies, as in Lan Tianyu patch, but
        it needs to be broather.


Lan Tianyu patch contains just these DMI_PRODUCT_NAME matches:    

            530U3BI/530U4BI/530U4BH
            530U3C/530U4C/532U3C

These two should at least be added:
            530U3C/530U4C           (reported by my laptop)
            535U3C                         (reported by D. G. Jansen laptop)

and also, at least all the ones needed for:

    Samsung Series 5 (models NP530U3C, NP535U3C, NP530U3B, NP550P5C)
    Samsung Series 9 (models NP900X3F, NP900X4B, NP900X4C, NP900X4D, NP900X3C)


I hope this helped,

Cheers!
--
Juan Manuel Cabo



On 03/25/2014 05:41 PM, Juan Manuel Cabo wrote:
> I can see a LID event being discarded here on resume:
>
>       [  728.861983] ACPI : EC: <--- command = 0x84
>       [  728.864062] ACPI : EC: ---> status = 0x09
>       [  728.864070] ACPI : EC: ---> data = 0x5f
>       [  728.864075] ACPI : EC: <--- command = 0x84
>       [  728.868054] ACPI : EC: ---> status = 0x09
>       [  728.868061] ACPI : EC: ---> data = 0x00
>       [  728.868066] ACPI : EC: 1 stale EC events cleared
>       [  728.868079] ACPI: Waking up from system sleep state S3
>
> the 0x5F event means it's a LID change event (both in your dsdt and mine, look for: Q5F)
>
> But there is still something I'm not sure of. So I will test that RC8 kernel tonight on my
> laptop and let you all know. If RC8 still works on my laptop, then I guess the best would be
> to make the EC fix specific to series 5, 7, 9 laptops, so your N150 is not affected anymore.
> Thank you so much Stefan for your in-depth tests!
>
> Cheers!
> --
> Juan Manuel Cabo
>
>
>
> On 03/25/2014 05:24 PM, Stefan Biereigel wrote:
>> Alright - as I have some spare time tonight, here is the dmesg of one
>> sleep/resume cycle on an unpatched 3.14-rc8 tree with my usual config +
>> dynamic debug enabled. After resuming, lid state was still "closed", so
>> the bug was triggered (and the stale event is removed as stated).
>>
>> http://pastebin.ca/2679209
>>
>> You can find the "EC: 1 stale EC events cleared" Line @ Line 316. I
>> really did not expect that much traffic from the EC, interesting to get
>> some insight into that!
>>
>> Regarding other actions: The system notices removal of the AC cord when
>> in sleep (xfce4-power-manager reports the correct symbol after the
>> wakeup), so I guess this is working as it should (as it is handled by
>> the _WAK-Routine??).
>>
>> Hope that helps!
>> Stefan
>>
>>
>> Am 25.03.2014 14:23, schrieb Kieran Clancy:
>>> On Tue, Mar 25, 2014 at 8:04 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
>>>> 2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>>> Hi,
>>>>> thank you for the suggestion. The patch resolves the issue on my N150
>>>>> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
>>>>> problems to mine now exist on the Samsung Series 7/9 notebooks?
>>>>>
>>>>> Is any further action from my part required?
>>>> Do you have these machines? If yes, please provide the output of
>>>> dmidecode command.
>>>>
>>>> Cc guys  of commit ad332c8a.
>>> Hi guys,
>>>
>>> That's a surprising side-effect, although I guess I shouldn't be
>>> surprised by Samsung ACPI weirdness anymore.
>>>
>>> If we can, I'd like to get to the bottom of this rather than just turn
>>> off this fix (which we know works for series 5, 7 and 9 without
>>> problems).
>>>
>>>>>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>>>>> Hi,
>>>>>>>
>>>>>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>>>>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>>>>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>>>>>> correctly the first time (and the system suspends), but after resuming
>>>>>>> from standby (by opening the lid), the lid does not change to "open" again.
>>>>>>> Of course, closing the lid again does not induce suspend to RAM then.
>>>>>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>>>>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>>>>>> coming from sleep.
>>>>>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>>>>>> reproducible: every other time, suspending works.
>>>>>>>
>>>>>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>>>>>> Clear stale EC events on Samsung systems.
>>>>>>> Which was introduced after 3.14-rc5.
>>>>>>>
>>>>>>> When opening the lid to resume from standby, i see in dmesg:
>>>>>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>>>>>> events cleared
>>>>>>> (which comes from drivers/acpi/ec.c)
>>>>>>>
>>>>>>> Seems to me, that the "open" event is cleared from the EC, but also
>>>>>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>>>>>> report all the pending events, read from the EC, as ACPI events? Can you
>>>>>>> point me in a direction for fixing the issue cleanly, then I will try to
>>>>>>> find a solution and prepare a patch for this issue.
>>> Stefan, thank you for reporting this issue.
>>>
>>> Our rationale for discarding the events was that events queued during
>>> sleep are probably no longer relevant. There could also be other
>>> unwanted side-effects of blindly executing all of the old
>>> instructions. But in your case, this assumption might be wrong.
>>>
>>> What command are you using to check if the lid is "open" or "closed"?
>>> Is it because the screen is not waking up, or some other effect, or
>>> just because it won't suspend again when it's re-closed?
>>>
>>> Do other events like AC plug/unplug affect any of this if you do them
>>> during this bad state?
>>>
>>> I'd like to see exactly which EC command byte is being thrown away
>>> here. If you do something like this (with dynamic debug enabled)
>>>
>>> echo -n 'file ec.c +p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>>>
>>> You should get massively verbose EC stuff filling your dmesg, but I am
>>> just interested in the EC read/write bytes just before and around the
>>> "1 stale EC events cleared" message. Grab this out of dmesg before it
>>> fills with other stuff.
>>>
>>> This will tell us what command we are being asked to run. If you can,
>>> please do it a few times to see if it's the same command each time or
>>> something different.
>>>
>>> You can turn the debug output off again with:
>>>
>>> echo -n 'file ec.c -p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>>>
>>> I might also need a copy of your DSDT, if you can send me that
>>> separately in another email (not to the list):
>>>
>>> cat /sys/firmware/acpi/tables/DSDT > .DSDT.aml
>>>
>>> Thank you,
>>> Kieran.
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> Please read the FAQ at  http://www.tux.org/lkml/
>>>

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Biereigel March 26, 2014, 10:42 a.m. UTC | #11
I don't know if it is a valid idea, but maybe it would be ok to process
events after resume in general, and only throw away events on those
platforms that continue to log events while in standby (Samsung 5/7/9)?

But after all, it would be better to find the command to tell the EC to
stop recording events and issue that before going to sleep. Is there any
way to find that command (for example on Windows)? Maybe that one is so
generic that we can issue it to all ECs, regardless of if it would
continue to log events during sleep.

Stefan

Am 25.03.2014 23:56, schrieb Juan Manuel Cabo:
> Just tested with 3.13.7 which contains the EC fix, and I get the same
> discarded event as your pastbin shows, but instead my LID status is
> correctly reported as "open" always.
> 
> So now I'm sure it boils down to your _WAK method not rechecking the LID state (mine
> does, I have a Samsung Series 5 NP530U3C).
>     If it did recheck the LID state and issue, then it wouldn't have mattered whether
> an accumulated LID event was discarded.
> 
> 
> So I see two possible fixes:
>     1) Modify the EC fix so that it doesn't discard events and instead processes them.
> or
>     2) Restrict the models to which the EC fix applies, as in Lan Tianyu patch, but
>         it needs to be broather.
> 
> 
> Lan Tianyu patch contains just these DMI_PRODUCT_NAME matches:    
> 
>             530U3BI/530U4BI/530U4BH
>             530U3C/530U4C/532U3C
> 
> These two should at least be added:
>             530U3C/530U4C           (reported by my laptop)
>             535U3C                         (reported by D. G. Jansen laptop)
> 
> and also, at least all the ones needed for:
> 
>     Samsung Series 5 (models NP530U3C, NP535U3C, NP530U3B, NP550P5C)
>     Samsung Series 9 (models NP900X3F, NP900X4B, NP900X4C, NP900X4D, NP900X3C)
> 
> 
> I hope this helped,
> 
> Cheers!
> --
> Juan Manuel Cabo
> 
> 
> 
> On 03/25/2014 05:41 PM, Juan Manuel Cabo wrote:
>> I can see a LID event being discarded here on resume:
>>
>>       [  728.861983] ACPI : EC: <--- command = 0x84
>>       [  728.864062] ACPI : EC: ---> status = 0x09
>>       [  728.864070] ACPI : EC: ---> data = 0x5f
>>       [  728.864075] ACPI : EC: <--- command = 0x84
>>       [  728.868054] ACPI : EC: ---> status = 0x09
>>       [  728.868061] ACPI : EC: ---> data = 0x00
>>       [  728.868066] ACPI : EC: 1 stale EC events cleared
>>       [  728.868079] ACPI: Waking up from system sleep state S3
>>
>> the 0x5F event means it's a LID change event (both in your dsdt and mine, look for: Q5F)
>>
>> But there is still something I'm not sure of. So I will test that RC8 kernel tonight on my
>> laptop and let you all know. If RC8 still works on my laptop, then I guess the best would be
>> to make the EC fix specific to series 5, 7, 9 laptops, so your N150 is not affected anymore.
>> Thank you so much Stefan for your in-depth tests!
>>
>> Cheers!
>> --
>> Juan Manuel Cabo
>>
>>
>>
>> On 03/25/2014 05:24 PM, Stefan Biereigel wrote:
>>> Alright - as I have some spare time tonight, here is the dmesg of one
>>> sleep/resume cycle on an unpatched 3.14-rc8 tree with my usual config +
>>> dynamic debug enabled. After resuming, lid state was still "closed", so
>>> the bug was triggered (and the stale event is removed as stated).
>>>
>>> http://pastebin.ca/2679209
>>>
>>> You can find the "EC: 1 stale EC events cleared" Line @ Line 316. I
>>> really did not expect that much traffic from the EC, interesting to get
>>> some insight into that!
>>>
>>> Regarding other actions: The system notices removal of the AC cord when
>>> in sleep (xfce4-power-manager reports the correct symbol after the
>>> wakeup), so I guess this is working as it should (as it is handled by
>>> the _WAK-Routine??).
>>>
>>> Hope that helps!
>>> Stefan
>>>
>>>
>>> Am 25.03.2014 14:23, schrieb Kieran Clancy:
>>>> On Tue, Mar 25, 2014 at 8:04 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
>>>>> 2014-03-24 19:19 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>>>> Hi,
>>>>>> thank you for the suggestion. The patch resolves the issue on my N150
>>>>>> when applied to a clean 3.14-rc7. Anyways I'm wondering if similar
>>>>>> problems to mine now exist on the Samsung Series 7/9 notebooks?
>>>>>>
>>>>>> Is any further action from my part required?
>>>>> Do you have these machines? If yes, please provide the output of
>>>>> dmidecode command.
>>>>>
>>>>> Cc guys  of commit ad332c8a.
>>>> Hi guys,
>>>>
>>>> That's a surprising side-effect, although I guess I shouldn't be
>>>> surprised by Samsung ACPI weirdness anymore.
>>>>
>>>> If we can, I'd like to get to the bottom of this rather than just turn
>>>> off this fix (which we know works for series 5, 7 and 9 without
>>>> problems).
>>>>
>>>>>>> 2014-03-24 15:50 GMT+08:00 Stefan Biereigel <security@biereigel-wb.de>:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> starting with 3.14-rc6, the lid on my Samsung N150 behaves weird: My
>>>>>>>> system is set up, so that it should suspend to RAM as soon as the lid is
>>>>>>>> closed. Beginning with 3.14-rc6, the lid goes from "open" to "closed"
>>>>>>>> correctly the first time (and the system suspends), but after resuming
>>>>>>>> from standby (by opening the lid), the lid does not change to "open" again.
>>>>>>>> Of course, closing the lid again does not induce suspend to RAM then.
>>>>>>>> Opening the lid now (while not sleeping), makes ACPI notify the opening,
>>>>>>>> so I guess ACPI "misses" or discards the lid open event from the EC when
>>>>>>>> coming from sleep.
>>>>>>>> Now, closing the lid again does induce suspend to RAM. This behaviour is
>>>>>>>> reproducible: every other time, suspending works.
>>>>>>>>
>>>>>>>> This behaviour seems to be introduced by commit ad332c8a: ACPI / EC:
>>>>>>>> Clear stale EC events on Samsung systems.
>>>>>>>> Which was introduced after 3.14-rc5.
>>>>>>>>
>>>>>>>> When opening the lid to resume from standby, i see in dmesg:
>>>>>>>> Mar 23 22:12:04 little1 kernel: [ 7630.932074] ACPI : EC: 1 stale EC
>>>>>>>> events cleared
>>>>>>>> (which comes from drivers/acpi/ec.c)
>>>>>>>>
>>>>>>>> Seems to me, that the "open" event is cleared from the EC, but also
>>>>>>>> discarded instead of passed on. Shouldn't the correct behaviour be to
>>>>>>>> report all the pending events, read from the EC, as ACPI events? Can you
>>>>>>>> point me in a direction for fixing the issue cleanly, then I will try to
>>>>>>>> find a solution and prepare a patch for this issue.
>>>> Stefan, thank you for reporting this issue.
>>>>
>>>> Our rationale for discarding the events was that events queued during
>>>> sleep are probably no longer relevant. There could also be other
>>>> unwanted side-effects of blindly executing all of the old
>>>> instructions. But in your case, this assumption might be wrong.
>>>>
>>>> What command are you using to check if the lid is "open" or "closed"?
>>>> Is it because the screen is not waking up, or some other effect, or
>>>> just because it won't suspend again when it's re-closed?
>>>>
>>>> Do other events like AC plug/unplug affect any of this if you do them
>>>> during this bad state?
>>>>
>>>> I'd like to see exactly which EC command byte is being thrown away
>>>> here. If you do something like this (with dynamic debug enabled)
>>>>
>>>> echo -n 'file ec.c +p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>>>>
>>>> You should get massively verbose EC stuff filling your dmesg, but I am
>>>> just interested in the EC read/write bytes just before and around the
>>>> "1 stale EC events cleared" message. Grab this out of dmesg before it
>>>> fills with other stuff.
>>>>
>>>> This will tell us what command we are being asked to run. If you can,
>>>> please do it a few times to see if it's the same command each time or
>>>> something different.
>>>>
>>>> You can turn the debug output off again with:
>>>>
>>>> echo -n 'file ec.c -p' | sudo tee /sys/kernel/debug/dynamic_debug/control
>>>>
>>>> I might also need a copy of your DSDT, if you can send me that
>>>> separately in another email (not to the list):
>>>>
>>>> cat /sys/firmware/acpi/tables/DSDT > .DSDT.aml
>>>>
>>>> Thank you,
>>>> Kieran.
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>> Please read the FAQ at  http://www.tux.org/lkml/
>>>>
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kieran Clancy March 26, 2014, 2:38 p.m. UTC | #12
On Wed, Mar 26, 2014 at 9:12 PM, Stefan Biereigel <stefan@biereigel.de> wrote:
>
> I don't know if it is a valid idea, but maybe it would be ok to process
> events after resume in general, and only throw away events on those
> platforms that continue to log events while in standby (Samsung 5/7/9)?

We can give it a shot!

It may even be that on Samsung 5/7/9 there is no harm in processing
the events instead of discarding them.

Here's a patch that we can try for that:

http://kieranclancy.com/tmp/2014/03/ec_clear_process.patch

I'm going to have to clear up some hard drive space before I can
compile a kernel with it (curse this tiny SSD), but if anyone else
wants to give it a shot please feel free.

One thing I am still trying to think about is that Stefan's system
must be about to check and process this EC event anyway, but we
discard it first. I wonder if there is some difference in state (e.g.
EC status bits) here that we might detect.

Failing all that, it might be easier to just whitelist this product
rather than try to find every series 5/7/9/?/... Samsung product name.
Something like this:

http://kieranclancy.com/tmp/2014/03/ec_clear_exclude_N150P.patch (not
intended for use with other patch, they are mutually exclusive
approaches)

> But after all, it would be better to find the command to tell the EC to
> stop recording events and issue that before going to sleep. Is there any
> way to find that command (for example on Windows)? Maybe that one is so
> generic that we can issue it to all ECs, regardless of if it would
> continue to log events during sleep.

It would be great to know what Windows does here, but even then we
still need to be able to clear a jammed EC. You can still find posts
around the internet where Windows users who've never touched Linux
have these same kind of problems with their Samsung laptops. My guess
is it only takes one blue screen of death or something where the EC
isn't shut down properly, the EC fills up and then it's more or less a
permanent state until the EC is cleared manually.

Thanks again for your testing, Stefan.

Kieran.
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki March 26, 2014, 3:01 p.m. UTC | #13
On Thursday, March 27, 2014 01:08:58 AM Kieran Clancy wrote:
> On Wed, Mar 26, 2014 at 9:12 PM, Stefan Biereigel <stefan@biereigel.de> wrote:
> >
> > I don't know if it is a valid idea, but maybe it would be ok to process
> > events after resume in general, and only throw away events on those
> > platforms that continue to log events while in standby (Samsung 5/7/9)?
> 
> We can give it a shot!
> 
> It may even be that on Samsung 5/7/9 there is no harm in processing
> the events instead of discarding them.
> 
> Here's a patch that we can try for that:
> 
> http://kieranclancy.com/tmp/2014/03/ec_clear_process.patch

Can you please send patches inline so that it's more convenient to
comment them if need be?
Stefan Biereigel March 26, 2014, 7:56 p.m. UTC | #14
Hi,

I tested both of your patches. The processing of events works well on my
N150, the lid is reported open correctly after resume.
For the second patch (the whitelisting-approach), I had to change the
Product Name to "N150/N210/N220" instead of "N150P", because that is
what dmidecode reports for my netbook.

So, all three approaches work equally well for me (whitelisting my
broken N150, blacklisting the broken Series 5/7/9, processing all the
stale events). I personally would prefer a solution which needs to
handle (best case) no custom cases, because there are always n+1 of
them. But, as I don't know if there may be any problems with the
approach that needs no special handling (processing all stale events) in
the future, I'm not the one to decide the correct solution.

What would you do? Do any of these patches create issues for you?

Greetings
Stefan

Am 26.03.2014 15:38, schrieb Kieran Clancy:
> On Wed, Mar 26, 2014 at 9:12 PM, Stefan Biereigel <stefan@biereigel.de> wrote:
>>
>> I don't know if it is a valid idea, but maybe it would be ok to process
>> events after resume in general, and only throw away events on those
>> platforms that continue to log events while in standby (Samsung 5/7/9)?
> 
> We can give it a shot!
> 
> It may even be that on Samsung 5/7/9 there is no harm in processing
> the events instead of discarding them.
> 
> Here's a patch that we can try for that:
> 
> http://kieranclancy.com/tmp/2014/03/ec_clear_process.patch
> 
> I'm going to have to clear up some hard drive space before I can
> compile a kernel with it (curse this tiny SSD), but if anyone else
> wants to give it a shot please feel free.
> 
> One thing I am still trying to think about is that Stefan's system
> must be about to check and process this EC event anyway, but we
> discard it first. I wonder if there is some difference in state (e.g.
> EC status bits) here that we might detect.
> 
> Failing all that, it might be easier to just whitelist this product
> rather than try to find every series 5/7/9/?/... Samsung product name.
> Something like this:
> 
> http://kieranclancy.com/tmp/2014/03/ec_clear_exclude_N150P.patch (not
> intended for use with other patch, they are mutually exclusive
> approaches)
> 
>> But after all, it would be better to find the command to tell the EC to
>> stop recording events and issue that before going to sleep. Is there any
>> way to find that command (for example on Windows)? Maybe that one is so
>> generic that we can issue it to all ECs, regardless of if it would
>> continue to log events during sleep.
> 
> It would be great to know what Windows does here, but even then we
> still need to be able to clear a jammed EC. You can still find posts
> around the internet where Windows users who've never touched Linux
> have these same kind of problems with their Samsung laptops. My guess
> is it only takes one blue screen of death or something where the EC
> isn't shut down properly, the EC fills up and then it's more or less a
> permanent state until the EC is cleared manually.
> 
> Thanks again for your testing, Stefan.
> 
> Kieran.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kieran Clancy March 26, 2014, 10:36 p.m. UTC | #15
On Thu, Mar 27, 2014 at 6:26 AM, Stefan Biereigel <stefan@biereigel.de> wrote:
> I tested both of your patches. The processing of events works well on my
> N150, the lid is reported open correctly after resume.
> For the second patch (the whitelisting-approach), I had to change the
> Product Name to "N150/N210/N220" instead of "N150P", because that is
> what dmidecode reports for my netbook.

That was quick - thanks for testing!

For the product name match then, it matches substrings not whole
strings, so "N150" should be sufficient (my mistake putting P on the
end).

> So, all three approaches work equally well for me (whitelisting my
> broken N150, blacklisting the broken Series 5/7/9, processing all the
> stale events). I personally would prefer a solution which needs to
> handle (best case) no custom cases, because there are always n+1 of
> them. But, as I don't know if there may be any problems with the
> approach that needs no special handling (processing all stale events) in
> the future, I'm not the one to decide the correct solution.

I won't be able to test ec_clear_process patch until tomorrow because
I have a full day today.

On my machine, _QXX events are all something like:

if (AC_PLUGGED_IN) {
    do_something();
}

So if (for example) AC_PLUGGED_IN has changed since the event was
produced (e.g. no longer plugged in), nothing bad should happen.
That's not necessarily a guarantee that this wouldn't introduce new
bugs on other machines though.

I think the ideal fix would be to distinguish between events which are
"jammed" and won't be processed (like on Series 5/7/9), and events
which will be processed normally with GPEs (N150). I am not sure how
to do this or if it's even possible.

For example, on my machine, the EC status byte (EC_SC) seems to be
0x29 for jammed events, which means the SCI_EVT bit is set but we
never got/get the interrupt. On your N150, your status byte was 0x09
which means the SCI_EVT was not set - it was not yet asking for the OS
to attend to this.

I wonder if something as simple as this would work (in acpi_ec_clear):

                if (!(acpi_ec_read_status(ec) & ACPI_EC_FLAG_SCI))
                        break;
                status = acpi_ec_query_unlocked(ec, &value);
                if (status || !value)
                        break;

This would make it only clear events while the SCI_EVT bit is set. I
am not sure that it would entirely remove the race condition you are
seeing, but it might be enough to fix it.

If we cant come up with a generally applicable solution, whitelisting
is the lesser of two evils when compared with blacklisting here. A
jammed EC won't function _at all_, while losing one or two EC events
on boot/resume doesn't prevent future events and is easier to work
around (though still not ideal).

Regards,
Kieran.
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Biereigel March 26, 2014, 10:41 p.m. UTC | #16
Am 26.03.2014 23:36, schrieb Kieran Clancy:
> On Thu, Mar 27, 2014 at 6:26 AM, Stefan Biereigel <stefan@biereigel.de> wrote:
>> I tested both of your patches. The processing of events works well on my
>> N150, the lid is reported open correctly after resume.
>> For the second patch (the whitelisting-approach), I had to change the
>> Product Name to "N150/N210/N220" instead of "N150P", because that is
>> what dmidecode reports for my netbook.
> That was quick - thanks for testing!
>
> For the product name match then, it matches substrings not whole
> strings, so "N150" should be sufficient (my mistake putting P on the
> end).
Alright, so that was the problem why it did not work with your original 
patch, but I missed that it did substring matching, so "N150" should be 
ok there.
>
>> So, all three approaches work equally well for me (whitelisting my
>> broken N150, blacklisting the broken Series 5/7/9, processing all the
>> stale events). I personally would prefer a solution which needs to
>> handle (best case) no custom cases, because there are always n+1 of
>> them. But, as I don't know if there may be any problems with the
>> approach that needs no special handling (processing all stale events) in
>> the future, I'm not the one to decide the correct solution.
> I won't be able to test ec_clear_process patch until tomorrow because
> I have a full day today.
>
> On my machine, _QXX events are all something like:
>
> if (AC_PLUGGED_IN) {
>      do_something();
> }
>
> So if (for example) AC_PLUGGED_IN has changed since the event was
> produced (e.g. no longer plugged in), nothing bad should happen.
> That's not necessarily a guarantee that this wouldn't introduce new
> bugs on other machines though.
>
> I think the ideal fix would be to distinguish between events which are
> "jammed" and won't be processed (like on Series 5/7/9), and events
> which will be processed normally with GPEs (N150). I am not sure how
> to do this or if it's even possible.
>
> For example, on my machine, the EC status byte (EC_SC) seems to be
> 0x29 for jammed events, which means the SCI_EVT bit is set but we
> never got/get the interrupt. On your N150, your status byte was 0x09
> which means the SCI_EVT was not set - it was not yet asking for the OS
> to attend to this.
>
> I wonder if something as simple as this would work (in acpi_ec_clear):
>
>                  if (!(acpi_ec_read_status(ec) & ACPI_EC_FLAG_SCI))
>                          break;
>                  status = acpi_ec_query_unlocked(ec, &value);
>                  if (status || !value)
>                          break;
>
> This would make it only clear events while the SCI_EVT bit is set. I
> am not sure that it would entirely remove the race condition you are
> seeing, but it might be enough to fix it.
>
> If we cant come up with a generally applicable solution, whitelisting
> is the lesser of two evils when compared with blacklisting here. A
> jammed EC won't function _at all_, while losing one or two EC events
> on boot/resume doesn't prevent future events and is easier to work
> around (though still not ideal).
You are right there, of course. Sadly, I can find nobody near me who 
owns one of the newer Samsung machines, therefore I can only contribute 
with testing on my machine. If there is anything else I could try, let 
me know.

Best wishes,
Stefan

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kieran Clancy April 1, 2014, 9:53 a.m. UTC | #17
On Sat, Mar 29, 2014 at 1:51 AM, D. G. Jansen <d.g.jansen@gmail.com> wrote:
>
> This patch (processing events) works for me on Samsung 535.

Hi Dennis and others,

Thank you for testing. I have also been testing this patch for a
couple of days, and have not encountered any problems.

I have now submitted a request for testing on the kernel bugzilla
entry for the bug fixed by the earlier patch in question:

https://bugzilla.kernel.org/show_bug.cgi?id=44161#c173

The CC list has 62 users, so hopefully we will get some more testers.
The comment I posted is repeated below:

--------------

Created attachment 131151 [details]
process rather than discard events in acpi_ec_clear

Hi all,

If you have a machine that was (previously) affected by this bug,
could you please take the time to try a new patch for this issue?

A user with an older Samsung machine found that the earlier patch
introduced some issues for them, so this is a new patch designed to be
a better solution to the problem.

Currently the new patch has only been tested on 2 affected machines
(to my knowledge), so I would be very grateful if anyone else can test
this patch (especially users with Samsung series 5 or 7 machines).

You will need to use a recent kernel source like 3. (one which
includes the earlier acpi_ec_clear patch), or otherwise apply the
previous patch first:

https://github.com/torvalds/linux/commit/ad332c8a45330d170bb38b95209de449b31cd1b4.patch

Then apply the attachment ec_process_stale_events.patch

Please test to see if your lid closing / suspending / etc. still work
as intended with this patch.

This new patch also provides some debug information after every
resume/boot which will show up in dmesg something like:

[ 1192.208056] ACPI : EC: acpi_ec_clear: EC_SC 0x28
[ 1192.217844] ACPI : EC: acpi_ec_clear: EC_SC 0x28
[ 1192.218844] ACPI : EC: acpi_ec_clear: EC_SC 0x28
[ 1192.222835] ACPI : EC: acpi_ec_clear: EC_SC 0x28
[ 1192.223834] ACPI : EC: acpi_ec_clear: EC_SC 0x28
[ 1192.224833] ACPI : EC: acpi_ec_clear: EC_SC 0x28
[ 1192.227833] ACPI : EC: acpi_ec_clear: EC_SC 0x28
[ 1192.228834] ACPI : EC: acpi_ec_clear: EC_SC 0x28
[ 1192.229832] ACPI : EC: acpi_ec_clear: EC_SC 0x28
[ 1192.230828] ACPI : EC: 8 stale EC events cleared

Please report if you get any EC_SC bytes here _other than_ 0x28. I
only get 0x28 on my machine but I am very interested to know if other
affected machines produce something else.

If you test it, please also report your system's product code, e.g.:

# dmidecode -s system-product-name
900X3F
# dmidecode -s baseboard-product-name
NP900X3F-K01AU

Many thanks,
Kieran
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas Porcel April 1, 2014, 11:36 a.m. UTC | #18
Hello,

Thank you for the patch. Since the update from 3.13.6 to 3.13.7, I have 
the same problem with the lid opening that is not detected on my Samsung 
N210. This patch fixes the problem.

I think this problem may affect more Samsung netbooks with similar 
hardware (N220, and maybe others).

Do you have any idea of how much time will it take to merge it to the 
mainline ?

Thank you

Regards

Nicolas Porcel

On 01/04/2014 17:53, Kieran Clancy wrote:
> On Sat, Mar 29, 2014 at 1:51 AM, D. G. Jansen <d.g.jansen@gmail.com> wrote:
>> This patch (processing events) works for me on Samsung 535.
> Hi Dennis and others,
>
> Thank you for testing. I have also been testing this patch for a
> couple of days, and have not encountered any problems.
>
> I have now submitted a request for testing on the kernel bugzilla
> entry for the bug fixed by the earlier patch in question:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=44161#c173
>
> The CC list has 62 users, so hopefully we will get some more testers.
> The comment I posted is repeated below:
>
> --------------
>
> Created attachment 131151 [details]
> process rather than discard events in acpi_ec_clear
>
> Hi all,
>
> If you have a machine that was (previously) affected by this bug,
> could you please take the time to try a new patch for this issue?
>
> A user with an older Samsung machine found that the earlier patch
> introduced some issues for them, so this is a new patch designed to be
> a better solution to the problem.
>
> Currently the new patch has only been tested on 2 affected machines
> (to my knowledge), so I would be very grateful if anyone else can test
> this patch (especially users with Samsung series 5 or 7 machines).
>
> You will need to use a recent kernel source like 3. (one which
> includes the earlier acpi_ec_clear patch), or otherwise apply the
> previous patch first:
>
> https://github.com/torvalds/linux/commit/ad332c8a45330d170bb38b95209de449b31cd1b4.patch
>
> Then apply the attachment ec_process_stale_events.patch
>
> Please test to see if your lid closing / suspending / etc. still work
> as intended with this patch.
>
> This new patch also provides some debug information after every
> resume/boot which will show up in dmesg something like:
>
> [ 1192.208056] ACPI : EC: acpi_ec_clear: EC_SC 0x28
> [ 1192.217844] ACPI : EC: acpi_ec_clear: EC_SC 0x28
> [ 1192.218844] ACPI : EC: acpi_ec_clear: EC_SC 0x28
> [ 1192.222835] ACPI : EC: acpi_ec_clear: EC_SC 0x28
> [ 1192.223834] ACPI : EC: acpi_ec_clear: EC_SC 0x28
> [ 1192.224833] ACPI : EC: acpi_ec_clear: EC_SC 0x28
> [ 1192.227833] ACPI : EC: acpi_ec_clear: EC_SC 0x28
> [ 1192.228834] ACPI : EC: acpi_ec_clear: EC_SC 0x28
> [ 1192.229832] ACPI : EC: acpi_ec_clear: EC_SC 0x28
> [ 1192.230828] ACPI : EC: 8 stale EC events cleared
>
> Please report if you get any EC_SC bytes here _other than_ 0x28. I
> only get 0x28 on my machine but I am very interested to know if other
> affected machines produce something else.
>
> If you test it, please also report your system's product code, e.g.:
>
> # dmidecode -s system-product-name
> 900X3F
> # dmidecode -s baseboard-product-name
> NP900X3F-K01AU
>
> Many thanks,
> Kieran
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kieran Clancy April 1, 2014, 11:58 a.m. UTC | #19
On Tue, Apr 1, 2014 at 10:06 PM, Nicolas Porcel
<nicolasporcel06@gmail.com> wrote:
> Do you have any idea of how much time will it take to merge it to the
> mainline ?

Apologies for the breakage. The key here is that we want a patch which
works 100% on all Samsung systems. If we rush a fix for this
regression, we may risk breaking things on other systems.

Fortunately, the work-around in the case of this regression is very
simple (just repeat whatever event was ignored the first time, be it
lid open, ac plug, etc) compared to the bug which the original patch
fixed (which had no real workaround and caused near-permanent
malfunction).

This is why I am asking people with Samsung systems to test the patch
(with the debug output I added in the kernel bugzilla attachment) and
confirm that it is working. I am also asking them to report EC_SC data
because we may be able to create a more targeted patch with this
information.

Speaking of which, can you please let me know the EC_SC bytes you see
when you test it on your system, and if it is always the same?

Once I hear back from users on a range of Samsung systems, I will
submit a patch. It should have enough time to land before 3.15. I will
also ask for it to be included in other stable branches.

Kind regards,
Kieran
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas Porcel April 1, 2014, 12:18 p.m. UTC | #20
Sure, I understand that you need to test it, that's why I am asking. I 
don't want an untested patch to be included in the kernel either. We 
also have to make sure that the patch fix the regression for all the 
affected Samsung laptops.

For the EC_SC code, I am using the first version of the patch without 
the pr_info line, so I need to recompile the kernel. I'll let you know 
as soon as it's done.

Best regards,

Nicolas Porcel

On 01/04/2014 19:58, Kieran Clancy wrote:
> On Tue, Apr 1, 2014 at 10:06 PM, Nicolas Porcel
> <nicolasporcel06@gmail.com> wrote:
>> Do you have any idea of how much time will it take to merge it to the
>> mainline ?
> Apologies for the breakage. The key here is that we want a patch which
> works 100% on all Samsung systems. If we rush a fix for this
> regression, we may risk breaking things on other systems.
>
> Fortunately, the work-around in the case of this regression is very
> simple (just repeat whatever event was ignored the first time, be it
> lid open, ac plug, etc) compared to the bug which the original patch
> fixed (which had no real workaround and caused near-permanent
> malfunction).
>
> This is why I am asking people with Samsung systems to test the patch
> (with the debug output I added in the kernel bugzilla attachment) and
> confirm that it is working. I am also asking them to report EC_SC data
> because we may be able to create a more targeted patch with this
> information.
>
> Speaking of which, can you please let me know the EC_SC bytes you see
> when you test it on your system, and if it is always the same?
>
> Once I hear back from users on a range of Samsung systems, I will
> submit a patch. It should have enough time to land before 3.15. I will
> also ask for it to be included in other stable branches.
>
> Kind regards,
> Kieran

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas Porcel April 1, 2014, 6:02 p.m. UTC | #21
I managed to recompile the kernel. Here is what I see in the logs:
kernel: ACPI : EC: acpi_ec_clear: EC_SC 0x20
kernel: ACPI : EC: acpi_ec_clear: EC_SC 0x08
kernel: ACPI : EC: 1 stale EC events cleared

I hope this will help

Regards,

Nicolas Porcel

Le 01/04/2014 20:18, Nicolas Porcel a écrit :
> Sure, I understand that you need to test it, that's why I am asking. I 
> don't want an untested patch to be included in the kernel either. We 
> also have to make sure that the patch fix the regression for all the 
> affected Samsung laptops.
>
> For the EC_SC code, I am using the first version of the patch without 
> the pr_info line, so I need to recompile the kernel. I'll let you know 
> as soon as it's done.
>
> Best regards,
>
> Nicolas Porcel
>
> On 01/04/2014 19:58, Kieran Clancy wrote:
>> On Tue, Apr 1, 2014 at 10:06 PM, Nicolas Porcel
>> <nicolasporcel06@gmail.com> wrote:
>>> Do you have any idea of how much time will it take to merge it to the
>>> mainline ?
>> Apologies for the breakage. The key here is that we want a patch which
>> works 100% on all Samsung systems. If we rush a fix for this
>> regression, we may risk breaking things on other systems.
>>
>> Fortunately, the work-around in the case of this regression is very
>> simple (just repeat whatever event was ignored the first time, be it
>> lid open, ac plug, etc) compared to the bug which the original patch
>> fixed (which had no real workaround and caused near-permanent
>> malfunction).
>>
>> This is why I am asking people with Samsung systems to test the patch
>> (with the debug output I added in the kernel bugzilla attachment) and
>> confirm that it is working. I am also asking them to report EC_SC data
>> because we may be able to create a more targeted patch with this
>> information.
>>
>> Speaking of which, can you please let me know the EC_SC bytes you see
>> when you test it on your system, and if it is always the same?
>>
>> Once I hear back from users on a range of Samsung systems, I will
>> submit a patch. It should have enough time to land before 3.15. I will
>> also ask for it to be included in other stable branches.
>>
>> Kind regards,
>> Kieran
>

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index d7d32c2..9239527 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -1027,8 +1027,13 @@  static struct dmi_system_id ec_dmi_table[] __initdata = {
        DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
        DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
        {
-       ec_clear_on_resume, "Samsung hardware", {
-       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
+       ec_clear_on_resume, "Samsung NP530U3B", {
+       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
+       DMI_MATCH(DMI_PRODUCT_NAME, "530U3BI/530U4BI/530U4BH"),}, NULL},
+       {
+       ec_clear_on_resume, "Samsung NP530U3C", {
+       DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
+       DMI_MATCH(DMI_PRODUCT_NAME, "530U3C/530U4C/532U3C"),}, NULL},
        {},
 };