Connection problems with Qualcomm Atheros QCA6174

Message ID CA+BoTQk9znROEy4o958DVuBkv1cpY6eG=-wCof1fahuh93PObw@mail.gmail.com (mailing list archive)
State Not Applicable
Delegated to: Kalle Valo

Commit Message

Michal Kazior Nov. 14, 2016, 1:13 p.m. UTC
On 13 November 2016 at 06:57, Henrý Þór Baldursson
<henry.baldursson@gmail.com> wrote:
> Hello
>
> I have a Lenovo Ideapad Yoga 910 running Antergos, kernel version 4.8.7.
>
> My problem is that intermittently my wifi will just grind to a halt
> and even stop working.
>
> The driver reports the firmware as being WLAN.RM.2.0-00180-QCARMSWPZ-1
>
> Here's the initialization in dmesg:
>
> [    1.900033] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2
> irq_mode 0 reset_mode 0
> [    2.173039] ath10k_pci 0000:01:00.0: Direct firmware load for
> ath10k/pre-cal-pci-0000:01:00.0.bin failed with error -2
> [    2.173054] ath10k_pci 0000:01:00.0: Direct firmware load for
> ath10k/cal-pci-0000:01:00.0.bin failed with error -2
> [    2.174217] ath10k_pci 0000:01:00.0: Direct firmware load for
> ath10k/QCA6174/hw3.0/firmware-5.bin failed with error -2
> [    2.174218] ath10k_pci 0000:01:00.0: could not fetch firmware file
> 'ath10k/QCA6174/hw3.0/firmware-5.bin': -2
> [    2.175485] ath10k_pci 0000:01:00.0: qca6174 hw3.2 target
> 0x05030000 chip_id 0x00340aff sub 17aa:0827
> [    2.175487] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1
> tracing 0 dfs 0 testmode 0
> [    2.175876] ath10k_pci 0000:01:00.0: firmware ver
> WLAN.RM.2.0-00180-QCARMSWPZ-1 api 4 features
> wowlan,ignore-otp,no-4addr-pad crc32 75dee6c5
> [    2.240312] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id N/A
> crc32 6fc88fe7
> [    4.405741] ath10k_pci 0000:01:00.0: htt-ver 3.26 wmi-op 4 htt-op 3
> cal otp max-sta 32 raw 0 hwcrypto 1
> [    4.485362] ath10k_pci 0000:01:00.0 wlp1s0: renamed from wlan0
>
>
> Here's what dmesg shows during the errors:
>
> [ 1238.710899] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
> [ 1238.710920] pcieport 0000:00:1c.0: PCIe Bus Error:
> severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
> [ 1238.710935] pcieport 0000:00:1c.0:   device [8086:9d14] error
> status/mask=00001000/00000000
> [ 1238.710945] pcieport 0000:00:1c.0:    [12] Replay Timer Timeout
> [ 1243.855456] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
> [ 1243.855472] pcieport 0000:00:1c.0: PCIe Bus Error:
> severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
> [ 1243.855484] pcieport 0000:00:1c.0:   device [8086:9d14] error
> status/mask=00001000/00000000
> [ 1243.855491] pcieport 0000:00:1c.0:    [12] Replay Timer Timeout
> [ 1296.095437] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
> [ 1296.095452] pcieport 0000:00:1c.0: PCIe Bus Error:
> severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
> [ 1296.095464] pcieport 0000:00:1c.0:   device [8086:9d14] error
> status/mask=00001000/00000000
> [ 1296.095472] pcieport 0000:00:1c.0:    [12] Replay Timer Timeout
> [ 1305.877547] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
> [ 1305.877562] pcieport 0000:00:1c.0: PCIe Bus Error:
> severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
> [ 1305.877574] pcieport 0000:00:1c.0:   device [8086:9d14] error
> status/mask=00001000/00000000
> [ 1305.877581] pcieport 0000:00:1c.0:    [12] Replay Timer Timeout
> [ 1556.596092] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
> [ 1556.596107] pcieport 0000:00:1c.0: PCIe Bus Error:
> severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
> [ 1556.596115] pcieport 0000:00:1c.0:   device [8086:9d14] error
> status/mask=00001000/00000000
> [ 1556.596122] pcieport 0000:00:1c.0:    [12] Replay Timer Timeout
> [ 1612.764108] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
> [ 1612.764124] pcieport 0000:00:1c.0: PCIe Bus Error:
> severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
> [ 1612.764135] pcieport 0000:00:1c.0:   device [8086:9d14] error
> status/mask=00001000/00000000
> [ 1612.764143] pcieport 0000:00:1c.0:    [12] Replay Timer Timeout
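
For reference, the "[12] Replay Timer Timeout" line in the log above follows
directly from the status word 00001000: bit 12 is the only bit set, and the
mask 00000000 hides nothing. A minimal sketch in C of that decoding, assuming
the bit layout of the PCIe AER Correctable Error Status register. The helper
names aer_cor_bit_name() and aer_cor_decode() are hypothetical and are not
kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Defined bits of the PCIe AER Correctable Error Status register. */
struct aer_cor_bit {
    unsigned bit;
    const char *name;
};

static const struct aer_cor_bit aer_cor_bits[] = {
    { 0,  "Receiver Error" },
    { 6,  "Bad TLP" },
    { 7,  "Bad DLLP" },
    { 8,  "REPLAY_NUM Rollover" },
    { 12, "Replay Timer Timeout" },
    { 13, "Advisory Non-Fatal Error" },
    { 14, "Corrected Internal Error" },
    { 15, "Header Log Overflow" },
};

/* Hypothetical helper: name of one status bit, or NULL if not listed. */
static const char *aer_cor_bit_name(unsigned bit)
{
    size_t i;

    for (i = 0; i < sizeof aer_cor_bits / sizeof aer_cor_bits[0]; i++) {
        if (aer_cor_bits[i].bit == bit)
            return aer_cor_bits[i].name;
    }
    return NULL;
}

/*
 * Hypothetical helper: print every error bit that is set in `status`
 * and not hidden by `mask`, and return how many were printed.
 */
static int aer_cor_decode(uint32_t status, uint32_t mask, FILE *out)
{
    uint32_t active = status & ~mask;
    int count = 0;
    unsigned bit;

    assert(out != NULL);

    for (bit = 0; bit < 32; bit++) {
        const char *name;

        if (!(active & (UINT32_C(1) << bit)))
            continue;
        name = aer_cor_bit_name(bit);
        fprintf(out, "[%2u] %s\n", bit, name ? name : "Reserved/unknown");
        count++;
    }
    return count;
}
```

Calling aer_cor_decode(0x00001000, 0x00000000, stdout) from a main() reports
one active error, bit 12, which matches what the kernel printed.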

Can you try overriding ar_pci->pci_ps to false in ath10k_pci_probe()
and see if it helps? Something along the lines of the diff under
"Patch" below.

Another thing which could be happening is ACPI S0ix states which - to
the best of my knowledge - Linux does not support. I've seen at least
i915 not being able to wake up DSI displays properly when S0ix states
are enabled on some 10-inch 2-in-1 devices. I wouldn't be surprised if
other PCI-E devices (such as wifi/network devices) were adversely
affected as well.

Therefore, can you check in the UEFI/BIOS setup (you may need to enable
"expert settings") whether there are any mentions of S0x or S0ix ACPI
modes and - if found - try disabling them and check whether you still
get AER/bus errors, please?


Michał

Comments

Henrý Þór Baldursson Nov. 14, 2016, 8:56 p.m. UTC | #1
Hi

Thanks for your reply. :-)

The BIOS didn't display options related to S0x or S0ix ACPI modes. But
then, it's a Lenovo BIOS and they disable a lot of features and don't
offer any expert mode.

I applied your patch, and will see if it keeps my network from failing.


It's been a while since I've done this. Is this a problem?
Nov 14 20:47:58 tempest kernel: ath10k_core: loading out-of-tree module taints kernel.



- Henry.

Henrý Þór Baldursson Nov. 16, 2016, 12:44 a.m. UTC | #2
Hello

It appears to still be happening. In fact, after each reboot my wifi
works for a few minutes and then craps out and I gotta cycle it. The
bus error I mentioned doesn't show up every time it fails, so maybe
it's unrelated.

Is there a way for me to debug the driver, get it to be more verbose
or something?


Henry


Patch

--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -3236,7 +3236,7 @@  static int ath10k_pci_probe(struct pci_dev *pdev,
        ar_pci->dev = &pdev->dev;
        ar_pci->ar = ar;
        ar->dev_id = pci_dev->device;
-       ar_pci->pci_ps = pci_ps;
+       ar_pci->pci_ps = false;
        ar_pci->bus_ops = &ath10k_pci_bus_ops;
        ar_pci->pci_soft_reset = pci_soft_reset;
        ar_pci->pci_hard_reset = pci_hard_reset;