
wifi: ath11k: Add a warning for wcn6855 spurious wakeup events

Message ID: 20230220213807.28523-1-mario.limonciello@amd.com (mailing list archive)
State: Rejected
Delegated to: Kalle Valo
Series: wifi: ath11k: Add a warning for wcn6855 spurious wakeup events

Commit Message

Mario Limonciello Feb. 20, 2023, 9:38 p.m. UTC
When WCN6855 firmware versions older than 0x110B196E are used with
an AMD APU and the user puts the system into s2idle, spurious wakeup
events can occur. These are difficult to attribute to the WLAN
firmware, so add a warning to the kernel driver to give users a hint
where to look.

This was tested on WCN6855 and a Lenovo Z13 with the following
firmware versions:
WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9
WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23

Link: http://lists.infradead.org/pipermail/ath11k/2023-February/004024.html
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2377
Link: https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/2006458
Link: https://lore.kernel.org/linux-gpio/20221012221028.4817-1-mario.limonciello@amd.com/
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/net/wireless/ath/ath11k/pci.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

Comments

Kalle Valo Feb. 27, 2023, 12:36 p.m. UTC | #1
Mario Limonciello <mario.limonciello@amd.com> writes:

> When WCN6855 firmware versions less than 0x110B196E are used with
> an AMD APU and the user puts the system into s2idle spurious wakeup
> events can occur. These are difficult to attribute to the WLAN F/W
> so add a warning to the kernel driver to give users a hint where
> to look.
>
> This was tested on WCN6855 and a Lenovo Z13 with the following
> firmware versions:
> WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9
> WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23
>
> Link: http://lists.infradead.org/pipermail/ath11k/2023-February/004024.html
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2377
> Link: https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/2006458
> Link: https://lore.kernel.org/linux-gpio/20221012221028.4817-1-mario.limonciello@amd.com/
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

[...]

> +static void ath11k_check_s2idle_bug(struct ath11k_base *ab)
> +{
> +	struct pci_dev *rdev;
> +
> +	if (pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
> +		return;
> +
> +	if (ab->id.device != WCN6855_DEVICE_ID)
> +		return;
> +
> +	if (ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER)
> +		return;
> +
> +	rdev = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0));
> +	if (rdev->vendor == PCI_VENDOR_ID_AMD)
> +		ath11k_warn(ab, "fw_version 0x%x may cause spurious wakeups. Upgrade to 0x%x or later.",
> +			    ab->qmi.target.fw_version, WCN6855_S2IDLE_VER);

I understand the reasons for this warning but I don't really trust the
check 'ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER'. I don't know
how the firmware team populates the fw_version so I'm worried that if we
ever switch to a different firmware branch (or similar) this warning
might all of a sudden start triggering for the users.
Mario Limonciello Feb. 27, 2023, 1:07 p.m. UTC | #2
On 2/27/23 06:36, Kalle Valo wrote:
> Mario Limonciello <mario.limonciello@amd.com> writes:
> 
>> When WCN6855 firmware versions less than 0x110B196E are used with
>> an AMD APU and the user puts the system into s2idle spurious wakeup
>> events can occur. These are difficult to attribute to the WLAN F/W
>> so add a warning to the kernel driver to give users a hint where
>> to look.
>>
>> This was tested on WCN6855 and a Lenovo Z13 with the following
>> firmware versions:
>> WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9
>> WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23
>>
>> Link: http://lists.infradead.org/pipermail/ath11k/2023-February/004024.html
>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2377
>> Link: https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/2006458
>> Link: https://lore.kernel.org/linux-gpio/20221012221028.4817-1-mario.limonciello@amd.com/
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> 
> [...]
> 
>> +static void ath11k_check_s2idle_bug(struct ath11k_base *ab)
>> +{
>> +	struct pci_dev *rdev;
>> +
>> +	if (pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
>> +		return;
>> +
>> +	if (ab->id.device != WCN6855_DEVICE_ID)
>> +		return;
>> +
>> +	if (ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER)
>> +		return;
>> +
>> +	rdev = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0));
>> +	if (rdev->vendor == PCI_VENDOR_ID_AMD)
>> +		ath11k_warn(ab, "fw_version 0x%x may cause spurious wakeups. Upgrade to 0x%x or later.",
>> +			    ab->qmi.target.fw_version, WCN6855_S2IDLE_VER);
> 
> I understand the reasons for this warning but I don't really trust the
> check 'ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER'. I don't know
> how the firmware team populates the fw_version so I'm worried that if we
> ever switch to a different firmware branch (or similar) this warning
> might all of sudden start triggering for the users.
> 

In that case, would it be better to just have a list of the public
firmware versions with the issue and check whether the running version
matches one of those?
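
The exact-match idea suggested here could look roughly like the user-space sketch below. It is not part of the patch. The function name, the array name and the two version values are hypothetical placeholders, because the thread does not list the affected versions; only WCN6855_DEVICE_ID comes from the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WCN6855_DEVICE_ID 0x1103 /* value taken from pci.c in the patch */

/* Placeholder entries only: the real affected versions are not given here. */
static const uint32_t model_s2idle_bad_fw[] = {
	0x11000001, /* hypothetical */
	0x11000002, /* hypothetical */
};

/*
 * Return true only for a WCN6855 running a version that is explicitly
 * listed. Unlike an ordered ">=" comparison, a version from another
 * firmware branch never matches unless someone adds it to the table.
 */
static bool model_fw_has_s2idle_bug(uint16_t device_id, uint32_t fw_version)
{
	size_t i;
	size_t n = sizeof(model_s2idle_bad_fw) / sizeof(model_s2idle_bad_fw[0]);

	if (device_id != WCN6855_DEVICE_ID)
		return false;

	for (i = 0; i < n; i++) {
		if (model_s2idle_bad_fw[i] == fw_version)
			return true;
	}

	return false;
}
```

The trade-off is that an exact-match table stays silent for any affected build nobody has listed, whereas the ordered comparison in the patch can fire for unrelated branches.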
Kalle Valo Feb. 27, 2023, 1:14 p.m. UTC | #3
Mario Limonciello <mario.limonciello@amd.com> writes:

> On 2/27/23 06:36, Kalle Valo wrote:
>
>> Mario Limonciello <mario.limonciello@amd.com> writes:
>>
>>> When WCN6855 firmware versions less than 0x110B196E are used with
>>> an AMD APU and the user puts the system into s2idle spurious wakeup
>>> events can occur. These are difficult to attribute to the WLAN F/W
>>> so add a warning to the kernel driver to give users a hint where
>>> to look.
>>>
>>> This was tested on WCN6855 and a Lenovo Z13 with the following
>>> firmware versions:
>>> WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9
>>> WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23
>>>
>>> Link: http://lists.infradead.org/pipermail/ath11k/2023-February/004024.html
>>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2377
>>> Link: https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/2006458
>>> Link: https://lore.kernel.org/linux-gpio/20221012221028.4817-1-mario.limonciello@amd.com/
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>
>> [...]
>>
>>> +static void ath11k_check_s2idle_bug(struct ath11k_base *ab)
>>> +{
>>> +	struct pci_dev *rdev;
>>> +
>>> +	if (pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
>>> +		return;
>>> +
>>> +	if (ab->id.device != WCN6855_DEVICE_ID)
>>> +		return;
>>> +
>>> +	if (ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER)
>>> +		return;
>>> +
>>> +	rdev = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0));
>>> +	if (rdev->vendor == PCI_VENDOR_ID_AMD)
>>> +		ath11k_warn(ab, "fw_version 0x%x may cause spurious wakeups. Upgrade to 0x%x or later.",
>>> +			    ab->qmi.target.fw_version, WCN6855_S2IDLE_VER);
>>
>> I understand the reasons for this warning but I don't really trust the
>> check 'ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER'. I don't know
>> how the firmware team populates the fw_version so I'm worried that if we
>> ever switch to a different firmware branch (or similar) this warning
>> might all of sudden start triggering for the users.
>>
>
> In that case, maybe would it be better to just have a list of the
> public firmware with issue and ensure it doesn't match one of those?

You mean ath11k checking for known broken versions and reporting that?
We have so many different firmwares to support in ath11k, I'm not really
keen on adding tests for a specific version.

We have a list of known important bugs in the wiki:

https://wireless.wiki.kernel.org/en/users/drivers/ath11k#known_bugslimitations

What about adding the issue there, would that get more exposure to the
bug and hopefully the users would upgrade the firmware?
Mario Limonciello Feb. 27, 2023, 1:19 p.m. UTC | #4
On 2/27/23 07:14, Kalle Valo wrote:
> Mario Limonciello <mario.limonciello@amd.com> writes:
> 
>> On 2/27/23 06:36, Kalle Valo wrote:
>>
>>> Mario Limonciello <mario.limonciello@amd.com> writes:
>>>
>>>> When WCN6855 firmware versions less than 0x110B196E are used with
>>>> an AMD APU and the user puts the system into s2idle spurious wakeup
>>>> events can occur. These are difficult to attribute to the WLAN F/W
>>>> so add a warning to the kernel driver to give users a hint where
>>>> to look.
>>>>
>>>> This was tested on WCN6855 and a Lenovo Z13 with the following
>>>> firmware versions:
>>>> WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9
>>>> WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23
>>>>
>>>> Link: http://lists.infradead.org/pipermail/ath11k/2023-February/004024.html
>>>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2377
>>>> Link: https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/2006458
>>>> Link: https://lore.kernel.org/linux-gpio/20221012221028.4817-1-mario.limonciello@amd.com/
>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>>
>>> [...]
>>>
>>>> +static void ath11k_check_s2idle_bug(struct ath11k_base *ab)
>>>> +{
>>>> +	struct pci_dev *rdev;
>>>> +
>>>> +	if (pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
>>>> +		return;
>>>> +
>>>> +	if (ab->id.device != WCN6855_DEVICE_ID)
>>>> +		return;
>>>> +
>>>> +	if (ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER)
>>>> +		return;
>>>> +
>>>> +	rdev = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0));
>>>> +	if (rdev->vendor == PCI_VENDOR_ID_AMD)
>>>> +		ath11k_warn(ab, "fw_version 0x%x may cause spurious wakeups. Upgrade to 0x%x or later.",
>>>> +			    ab->qmi.target.fw_version, WCN6855_S2IDLE_VER);
>>>
>>> I understand the reasons for this warning but I don't really trust the
>>> check 'ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER'. I don't know
>>> how the firmware team populates the fw_version so I'm worried that if we
>>> ever switch to a different firmware branch (or similar) this warning
>>> might all of sudden start triggering for the users.
>>>
>>
>> In that case, maybe would it be better to just have a list of the
>> public firmware with issue and ensure it doesn't match one of those?
> 
> You mean ath11k checking for known broken versions and reporting that?
> We have so many different firmwares to support in ath11k, I'm not really
> keen on adding tests for a specific version.

I checked and only found a total of 7 firmware versions published for 
WCN6855 at your ath11k-firmware repo.  I'm not sure how many went to 
linux-firmware.  But it seems like a relatively small list to have.

> 
> We have a list of known important bugs in the wiki:
> 
> https://wireless.wiki.kernel.org/en/users/drivers/ath11k#known_bugslimitations
> 
> What about adding the issue there, would that get more exposure to the
> bug and hopefully the users would upgrade the firmware?
> 

The problem is when this happens users have no way to know it's even 
caused by wireless.  So why would they go looking at the wireless wiki?

The GPIO used for WLAN is different from design to design so we can't 
put it in the GPIO driver.  There are plenty of designs that have valid 
reasons to wakeup from other GPIOs as well so it can't just be the GPIO 
driver IRQ.
Kalle Valo April 5, 2023, 10:27 a.m. UTC | #5
Mario Limonciello <mario.limonciello@amd.com> writes:

> On 2/27/23 07:14, Kalle Valo wrote:
>
>> Mario Limonciello <mario.limonciello@amd.com> writes:
>>
>>> On 2/27/23 06:36, Kalle Valo wrote:
>>>
>>>> Mario Limonciello <mario.limonciello@amd.com> writes:
>>>>
>>>>> +static void ath11k_check_s2idle_bug(struct ath11k_base *ab)
>>>>> +{
>>>>> +	struct pci_dev *rdev;
>>>>> +
>>>>> +	if (pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
>>>>> +		return;
>>>>> +
>>>>> +	if (ab->id.device != WCN6855_DEVICE_ID)
>>>>> +		return;
>>>>> +
>>>>> +	if (ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER)
>>>>> +		return;
>>>>> +
>>>>> +	rdev = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0));
>>>>> +	if (rdev->vendor == PCI_VENDOR_ID_AMD)
>>>>> +		ath11k_warn(ab, "fw_version 0x%x may cause spurious wakeups. Upgrade to 0x%x or later.",
>>>>> +			    ab->qmi.target.fw_version, WCN6855_S2IDLE_VER);
>>>>
>>>> I understand the reasons for this warning but I don't really trust the
>>>> check 'ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER'. I don't know
>>>> how the firmware team populates the fw_version so I'm worried that if we
>>>> ever switch to a different firmware branch (or similar) this warning
>>>> might all of sudden start triggering for the users.
>>>>
>>>
>>> In that case, maybe would it be better to just have a list of the
>>> public firmware with issue and ensure it doesn't match one of those?
>>
>> You mean ath11k checking for known broken versions and reporting that?
>> We have so many different firmwares to support in ath11k, I'm not really
>> keen on adding tests for a specific version.
>
> I checked and only found a total of 7 firmware versions published for
> WCN6855 at your ath11k-firmware repo.  I'm not sure how many went to
> linux-firmware.  But it seems like a relatively small list to have.

ath11k supports also other hardware families than just WCN6855, so there
are a lot of different firmware versions and branches.

>> We have a list of known important bugs in the wiki:
>>
>> https://wireless.wiki.kernel.org/en/users/drivers/ath11k#known_bugslimitations
>>
>> What about adding the issue there, would that get more exposure to the
>> bug and hopefully the users would upgrade the firmware?
>>
>
> The problem is when this happens users have no way to know it's even
> caused by wireless.  So why would they go looking at the wireless
> wiki?
>
> The GPIO used for WLAN is different from design to design so we can't
> put it in the GPIO driver.  There are plenty of designs that have
> valid reasons to wakeup from other GPIOs as well so it can't just be
> the GPIO driver IRQ.

I understand your problem but my problem is that I have three Qualcomm
drivers to support and that's a major challenge itself. So I try to keep
the drivers as simple as possible and avoid any hacks.
Mario Limonciello April 5, 2023, 8:47 p.m. UTC | #6
[Public]

> > On 2/27/23 07:14, Kalle Valo wrote:
> >
> >> Mario Limonciello <mario.limonciello@amd.com> writes:
> >>
> >>> On 2/27/23 06:36, Kalle Valo wrote:
> >>>
> >>>> Mario Limonciello <mario.limonciello@amd.com> writes:
> >>>>
> >>>>> +static void ath11k_check_s2idle_bug(struct ath11k_base *ab)
> >>>>> +{
> >>>>> +	struct pci_dev *rdev;
> >>>>> +
> >>>>> +	if (pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
> >>>>> +		return;
> >>>>> +
> >>>>> +	if (ab->id.device != WCN6855_DEVICE_ID)
> >>>>> +		return;
> >>>>> +
> >>>>> +	if (ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER)
> >>>>> +		return;
> >>>>> +
> >>>>> +	rdev = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0));
> >>>>> +	if (rdev->vendor == PCI_VENDOR_ID_AMD)
> >>>>> +		ath11k_warn(ab, "fw_version 0x%x may cause spurious wakeups. Upgrade to 0x%x or later.",
> >>>>> +			    ab->qmi.target.fw_version, WCN6855_S2IDLE_VER);
> >>>>
> >>>> I understand the reasons for this warning but I don't really trust the
> >>>> check 'ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER'. I don't know
> >>>> how the firmware team populates the fw_version so I'm worried that if we
> >>>> ever switch to a different firmware branch (or similar) this warning
> >>>> might all of sudden start triggering for the users.
> >>>>
> >>>
> >>> In that case, maybe would it be better to just have a list of the
> >>> public firmware with issue and ensure it doesn't match one of those?
> >>
> >> You mean ath11k checking for known broken versions and reporting that?
> >> We have so many different firmwares to support in ath11k, I'm not really
> >> keen on adding tests for a specific version.
> >
> > I checked and only found a total of 7 firmware versions published for
> > WCN6855 at your ath11k-firmware repo.  I'm not sure how many went to
> > linux-firmware.  But it seems like a relatively small list to have.
> 
> ath11k supports also other hardware families than just WCN6855, so there
> are a lot of different firmware versions and branches.

Right, but this change was explicitly checking the device ID matched WCN6855.

So it could be a single check for that device and any of the 5 bad firmware binaries.

> 
> >> We have a list of known important bugs in the wiki:
> >>
> >> https://wireless.wiki.kernel.org/en/users/drivers/ath11k#known_bugslimitations
> >>
> >> What about adding the issue there, would that get more exposure to the
> >> bug and hopefully the users would upgrade the firmware?
> >>
> >
> > The problem is when this happens users have no way to know it's even
> > caused by wireless.  So why would they go looking at the wireless
> > wiki?
> >
> > The GPIO used for WLAN is different from design to design so we can't
> > put it in the GPIO driver.  There are plenty of designs that have
> > valid reasons to wakeup from other GPIOs as well so it can't just be
> > the GPIO driver IRQ.
> 
> I understand your problem but my problem is that I have three Qualcomm
> drivers to support and that's a major challenge itself. So I try to keep
> the drivers as simple as possible and avoid any hacks.

OK.

Patch

diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c
index 99cf3357c66e..87536327e214 100644
--- a/drivers/net/wireless/ath/ath11k/pci.c
+++ b/drivers/net/wireless/ath/ath11k/pci.c
@@ -8,6 +8,7 @@ 
 #include <linux/msi.h>
 #include <linux/pci.h>
 #include <linux/of.h>
+#include <linux/suspend.h>
 
 #include "pci.h"
 #include "core.h"
@@ -27,6 +28,8 @@ 
 #define QCN9074_DEVICE_ID		0x1104
 #define WCN6855_DEVICE_ID		0x1103
 
+#define WCN6855_S2IDLE_VER		0x110b196e
+
 static const struct pci_device_id ath11k_pci_id_table[] = {
 	{ PCI_VDEVICE(QCOM, QCA6390_DEVICE_ID) },
 	{ PCI_VDEVICE(QCOM, WCN6855_DEVICE_ID) },
@@ -965,6 +968,27 @@  static void ath11k_pci_shutdown(struct pci_dev *pdev)
 	ath11k_pci_power_down(ab);
 }
 
+static void ath11k_check_s2idle_bug(struct ath11k_base *ab)
+{
+	struct pci_dev *rdev;
+
+	if (pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
+		return;
+
+	if (ab->id.device != WCN6855_DEVICE_ID)
+		return;
+
+	if (ab->qmi.target.fw_version >= WCN6855_S2IDLE_VER)
+		return;
+
+	rdev = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0));
+	if (rdev->vendor == PCI_VENDOR_ID_AMD)
+		ath11k_warn(ab, "fw_version 0x%x may cause spurious wakeups. Upgrade to 0x%x or later.",
+			    ab->qmi.target.fw_version, WCN6855_S2IDLE_VER);
+
+	pci_dev_put(rdev);
+}
+
 static __maybe_unused int ath11k_pci_pm_suspend(struct device *dev)
 {
 	struct ath11k_base *ab = dev_get_drvdata(dev);
@@ -975,6 +999,8 @@  static __maybe_unused int ath11k_pci_pm_suspend(struct device *dev)
 		return 0;
 	}
 
+	ath11k_check_s2idle_bug(ab);
+
 	ret = ath11k_core_suspend(ab);
 	if (ret)
 		ath11k_warn(ab, "failed to suspend core: %d\n", ret);
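
A note on the added helper: pci_get_domain_bus_and_slot() returns NULL when no matching device exists, and the patch reads rdev->vendor without checking for that. The user-space sketch below models a NULL-safe form of the same test. The struct and function names are hypothetical stand-ins, not kernel types; only the PCI_VENDOR_ID_AMD value matches the kernel's definition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_ID_AMD 0x1022 /* same value as the kernel's pci_ids.h */

/* Hypothetical stand-in for struct pci_dev; only the field used here. */
struct model_pci_dev {
	uint16_t vendor;
};

/*
 * Return true only when a root device was found and its vendor is AMD.
 * A NULL pointer models a failed device lookup.
 */
static bool model_root_is_amd(const struct model_pci_dev *rdev)
{
	return rdev != NULL && rdev->vendor == PCI_VENDOR_ID_AMD;
}
```

In the driver this would presumably amount to testing `rdev && rdev->vendor == PCI_VENDOR_ID_AMD`; the later pci_dev_put(rdev) already tolerates a NULL argument. Separately, the new format string lacks the trailing "\n" that the neighbouring ath11k_warn() call in ath11k_pci_pm_suspend() uses.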