
brcmfmac: pcie: fix oops on failure to resume and reprobe

Message ID: 20210817063521.22450-1-a.fatoum@pengutronix.de
State: Awaiting Upstream
Delegated to: Netdev Maintainers

Checks

Context: netdev/tree_selection
Check: success
Description: Not a local patch

Commit Message

Ahmad Fatoum Aug. 17, 2021, 6:35 a.m. UTC
When resuming from suspend, brcmf_pcie_pm_leave_D3 will first attempt a
hot resume and then fall back to removing the PCI device and then
reprobing. If this probe fails, the kernel will oops, because brcmf_err,
which is called to report the failure, will dereference the stale bus
pointer. Open code the default bus-less brcmf_err, i.e. call __brcmf_err
with a NULL bus, to avoid this.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
To: Arend van Spriel <aspriel@gmail.com>
To: Franky Lin <franky.lin@broadcom.com>
To: Hante Meuleman <hante.meuleman@broadcom.com>
To: Chi-hsien Lin <chi-hsien.lin@infineon.com>
To: Wright Feng <wright.feng@infineon.com>
To: Chung-hsien Hsu <chung-hsien.hsu@infineon.com>
Cc: SHA-cyfmac-dev-list@infineon.com
Cc: brcm80211-dev-list.pdl@broadcom.com
Cc: netdev@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org
---
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
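
The failure mode described in the commit message can be modelled in userspace. The following is a minimal sketch, not driver code: the `model_*` types and functions are hypothetical stand-ins, and it assumes only what the thread states, namely that the logger uses the bus pointer when it is given one and that a NULL bus selects a bus-less message.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's types; not the real kernel structs. */
struct model_device { const char *name; };
struct model_bus { struct model_device *dev; };

/*
 * Model of a bus-aware error logger. With a bus it prefixes the device
 * name, which requires dereferencing the bus pointer. Without a bus it
 * emits a plain message. Returns the snprintf-style length.
 */
static int model_err(char *buf, size_t len, const struct model_bus *bus,
		     const char *func, const char *fmt, ...)
{
	va_list args;
	int n;

	if (bus)
		n = snprintf(buf, len, "%s %s: ", bus->dev->name, func);
	else
		n = snprintf(buf, len, "%s: ", func);
	if (n < 0 || (size_t)n >= len)
		return n;

	va_start(args, fmt);
	n += vsnprintf(buf + n, len - n, fmt, args);
	va_end(args);
	return n;
}

/*
 * Model of the resume path after the old bus has been torn down and the
 * reprobe has failed. The old bus pointer no longer refers to live memory,
 * so only the bus-less (NULL) form may be used to report the error.
 */
static int model_resume_reprobe_failed(char *buf, size_t len, int err)
{
	model_err(buf, len, NULL, __func__,
		  "probe after resume failed, err=%d\n", err);
	return err;
}
```

Passing NULL trades the device prefix in the message for safety: nothing is dereferenced on the failure path.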

Comments

Ahmad Fatoum Aug. 17, 2021, 10:01 a.m. UTC | #1
On 17.08.21 08:35, Ahmad Fatoum wrote:
> When resuming from suspend, brcmf_pcie_pm_leave_D3 will first attempt a
> hot resume and then fall back to removing the PCI device and then
> reprobing. If this probe fails, the kernel will oops, because brcmf_err,
> which is called to report the failure will dereference the stale bus
> pointer. Open code and use the default bus-less brcmf_err to avoid this.

Should've included a Fixes tag:

Fixes: 8602e62441ab ("brcmfmac: pass bus to the __brcmf_err() in pcie.c")

Please let me know if I should resend with the tag added.

Cheers,
Ahmad
 
> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
> ---
> To: Arend van Spriel <aspriel@gmail.com>
> To: Franky Lin <franky.lin@broadcom.com>
> To: Hante Meuleman <hante.meuleman@broadcom.com>
> To: Chi-hsien Lin <chi-hsien.lin@infineon.com>
> To: Wright Feng <wright.feng@infineon.com>
> To: Chung-hsien Hsu <chung-hsien.hsu@infineon.com>
> Cc: SHA-cyfmac-dev-list@infineon.com
> Cc: brcm80211-dev-list.pdl@broadcom.com
> Cc: netdev@vger.kernel.org
> Cc: linux-wireless@vger.kernel.org
> Cc: Kalle Valo <kvalo@codeaurora.org>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
> index 9ef94d7a7ca7..d824bea4b79d 100644
> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
> @@ -2209,7 +2209,7 @@ static int brcmf_pcie_pm_leave_D3(struct device *dev)
>  
>  	err = brcmf_pcie_probe(pdev, NULL);
>  	if (err)
> -		brcmf_err(bus, "probe after resume failed, err=%d\n", err);
> +		__brcmf_err(NULL, __func__, "probe after resume failed, err=%d\n", err);
>  
>  	return err;
>  }
>
Ahmad Fatoum Aug. 17, 2021, 11:11 a.m. UTC | #2
On 17.08.21 13:02, Andy Shevchenko wrote:
> On Tuesday, August 17, 2021, Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
> 
>> When resuming from suspend, brcmf_pcie_pm_leave_D3 will first attempt a
>> hot resume and then fall back to removing the PCI device and then
>> reprobing. If this probe fails, the kernel will oops, because brcmf_err,
>> which is called to report the failure will dereference the stale bus
>> pointer. Open code and use the default bus-less brcmf_err to avoid this.
>>
>> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
>> ---
>> To: Arend van Spriel <aspriel@gmail.com>
>> To: Franky Lin <franky.lin@broadcom.com>
>> To: Hante Meuleman <hante.meuleman@broadcom.com>
>> To: Chi-hsien Lin <chi-hsien.lin@infineon.com>
>> To: Wright Feng <wright.feng@infineon.com>
>> To: Chung-hsien Hsu <chung-hsien.hsu@infineon.com>
>> Cc: SHA-cyfmac-dev-list@infineon.com
>> Cc: brcm80211-dev-list.pdl@broadcom.com
>> Cc: netdev@vger.kernel.org
>> Cc: linux-wireless@vger.kernel.org
>> Cc: Kalle Valo <kvalo@codeaurora.org>
>> Cc: Jakub Kicinski <kuba@kernel.org>
>> Cc: "David S. Miller" <davem@davemloft.net>
>> Cc: linux-kernel@vger.kernel.org
>> ---
>>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
>> b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
>> index 9ef94d7a7ca7..d824bea4b79d 100644
>> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
>> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
>> @@ -2209,7 +2209,7 @@ static int brcmf_pcie_pm_leave_D3(struct device *dev)
>>
>>         err = brcmf_pcie_probe(pdev, NULL);
>>         if (err)
>> -               brcmf_err(bus, "probe after resume failed, err=%d\n", err);
>> +               __brcmf_err(NULL, __func__, "probe after resume failed,
>> err=%d\n",
> 
> 
> This is weird looking line now. Why can’t you simply use dev_err() /
> netdev_err()?

That's what brcmf_err normally expands to, but in this file the macro
is overridden to add the extra first argument.

The brcmf_ logging functions write to the brcmf trace buffers. This is not
done with netdev_err/dev_err (and replacing the existing logging
is out of scope for a regression fix anyway).

Cheers,
Ahmad

> 
> 
>>
>>         return err;
>>  }
>> --
>> 2.30.2
>>
>>
>
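
The macro override mentioned in this reply can be illustrated with a small userspace sketch. All `ovr_*` names are hypothetical and the macro bodies are simplified assumptions, not the driver's actual definitions. The point is only that once a file redefines the macro to take a leading bus argument, the bus-less form is no longer reachable through the macro and has to be open coded against the backend.

```c
#include <stddef.h>
#include <stdio.h>

struct ovr_bus { const char *name; };	/* hypothetical stand-in */

/* Backend that every logging macro funnels into. */
static int ovr_err_impl(char *buf, size_t len, const struct ovr_bus *bus,
			const char *func, const char *msg)
{
	if (bus)
		return snprintf(buf, len, "%s %s: %s", bus->name, func, msg);
	return snprintf(buf, len, "%s: %s", func, msg);
}

/* Default, bus-less form, as a shared header might provide it. */
#define ovr_err(buf, len, msg) \
	ovr_err_impl((buf), (len), NULL, __func__, (msg))

/* File-local override: same name, but now with a leading bus argument. */
#undef ovr_err
#define ovr_err(buf, len, bus, msg) \
	ovr_err_impl((buf), (len), (bus), __func__, (msg))

/* With the override active, the bus-less form has to be open coded. */
static int ovr_report_without_bus(char *buf, size_t len)
{
	return ovr_err_impl(buf, len, NULL, __func__, "no bus available");
}

/* Callers that do hold a valid bus keep using the overridden macro. */
static int ovr_report_with_bus(char *buf, size_t len, const struct ovr_bus *bus)
{
	return ovr_err(buf, len, bus, "bus available");
}
```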
Andy Shevchenko Aug. 17, 2021, 11:54 a.m. UTC | #3
On Tue, Aug 17, 2021 at 2:11 PM Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
> On 17.08.21 13:02, Andy Shevchenko wrote:
> > On Tuesday, August 17, 2021, Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:

...

> >>         err = brcmf_pcie_probe(pdev, NULL);
> >>         if (err)
> >> -               brcmf_err(bus, "probe after resume failed, err=%d\n", err);
> >> +               __brcmf_err(NULL, __func__, "probe after resume failed,
> >> err=%d\n",
> >
> >
> > This is weird looking line now. Why can’t you simply use dev_err() /
> > netdev_err()?
>
> That's what brcmf_err normally expands to, but in this file the macro
> is overridden to add the extra first argument.

So the problem is in the macro here. You need another set of
macro(s) that uses the dev pointer directly. When you have a valid
device, use it, and that seems to be the case here.

> The brcmf_ logging functions write to the brcmf trace buffers. This is not
> done with netdev_err/dev_err (and replacing the existing logging
> is out of scope for a regression fix anyway).

I see.
Ahmad Fatoum Aug. 17, 2021, 12:03 p.m. UTC | #4
On 17.08.21 13:54, Andy Shevchenko wrote:
> On Tue, Aug 17, 2021 at 2:11 PM Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
>> On 17.08.21 13:02, Andy Shevchenko wrote:
>>> On Tuesday, August 17, 2021, Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
> 
> ...
> 
>>>>         err = brcmf_pcie_probe(pdev, NULL);
>>>>         if (err)
>>>> -               brcmf_err(bus, "probe after resume failed, err=%d\n", err);
>>>> +               __brcmf_err(NULL, __func__, "probe after resume failed,
>>>> err=%d\n",
>>>
>>>
>>> This is weird looking line now. Why can’t you simply use dev_err() /
>>> netdev_err()?
>>
>> That's what brcmf_err normally expands to, but in this file the macro
>> is overridden to add the extra first argument.
> 
> So the problem is in the macro here. You need another set of
> macro(s) that uses the dev pointer directly. When you have a valid
> device, use it, and that seems to be the case here.

Ah, you mean using pdev instead of the stale bus. Yes, I could do that.
Thanks for pointing that out.

> 
>> The brcmf_ logging functions write to the brcmf trace buffers. This is not
>> done with netdev_err/dev_err (and replacing the existing logging
>> is out of scope for a regression fix anyway).
> 
> I see.
>
Ahmad Fatoum Aug. 17, 2021, 12:07 p.m. UTC | #5
On 17.08.21 14:03, Ahmad Fatoum wrote:
> On 17.08.21 13:54, Andy Shevchenko wrote:
>> On Tue, Aug 17, 2021 at 2:11 PM Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
>>> On 17.08.21 13:02, Andy Shevchenko wrote:
>>>> On Tuesday, August 17, 2021, Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
>>
>> ...
>>
>>>>>         err = brcmf_pcie_probe(pdev, NULL);
>>>>>         if (err)
>>>>> -               brcmf_err(bus, "probe after resume failed, err=%d\n", err);
>>>>> +               __brcmf_err(NULL, __func__, "probe after resume failed,
>>>>> err=%d\n",
>>>>
>>>>
>>>> This is weird looking line now. Why can’t you simply use dev_err() /
>>>> netdev_err()?
>>>
>>> That's what brcmf_err normally expands to, but in this file the macro
>>> is overridden to add the extra first argument.
>>
>> So the problem is in the macro here. You need another set of
>> macro(s) that uses the dev pointer directly. When you have a valid
>> device, use it, and that seems to be the case here.
> 
> Ah, you mean using pdev instead of the stale bus. Yes, I could do that.
> Thanks for pointing that out.

Ah, not so easy: __brcmf_err accepts a struct brcmf_bus * as first argument,
but there is none I can pass along. As the whole file uses the brcmf_
logging functions, I'd just leave this one without a device.

> 
>>
>>> The brcmf_ logging functions write to the brcmf trace buffers. This is not
>>> done with netdev_err/dev_err (and replacing the existing logging
>>> is out of scope for a regression fix anyway).
>>
>> I see.
>>
> 
>
Andy Shevchenko Aug. 17, 2021, 1:06 p.m. UTC | #6
On Tue, Aug 17, 2021 at 3:07 PM Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
> On 17.08.21 14:03, Ahmad Fatoum wrote:
> > On 17.08.21 13:54, Andy Shevchenko wrote:
> >> On Tue, Aug 17, 2021 at 2:11 PM Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
> >>> On 17.08.21 13:02, Andy Shevchenko wrote:
> >>>> On Tuesday, August 17, 2021, Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:

...

> >>>>>         err = brcmf_pcie_probe(pdev, NULL);
> >>>>>         if (err)
> >>>>> -               brcmf_err(bus, "probe after resume failed, err=%d\n", err);
> >>>>> +               __brcmf_err(NULL, __func__, "probe after resume failed,
> >>>>> err=%d\n",
> >>>>
> >>>>
> >>>> This is weird looking line now. Why can’t you simply use dev_err() /
> >>>> netdev_err()?
> >>>
> >>> That's what brcmf_err normally expands to, but in this file the macro
> >>> is overridden to add the extra first argument.
> >>
> >> So the problem is in the macro here. You need another set of
> >> macro(s) that uses the dev pointer directly. When you have a valid
> >> device, use it, and that seems to be the case here.
> >
> > Ah, you mean using pdev instead of the stale bus. Yes, I could do that.
> > Thanks for pointing that out.
>
> Ah, not so easy: __brcmf_err accepts a struct brcmf_bus * as first argument,
> but there is none I can pass along. As the whole file uses the brcmf_
> logging functions, I'd just leave this one without a device.

And what exactly prevents you from splitting that into something like

__brcmf_dev_err() // as current __brcmf_err, but with a dev argument
{
...
}

__brcmf_err(bus, ...)  __brcmf_dev_err(bus->dev, ...)

?
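
The split suggested above could look roughly like the following userspace sketch. It is an illustration of the idea only: the `split_*` names are hypothetical, the real logging backend is not reproduced, and the NULL check in the wrapper is an added assumption so that a missing bus still yields a bus-less message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; not the real kernel structs. */
struct split_device { const char *name; };
struct split_bus { struct split_device *dev; };

/* Device-level logger: takes the device directly, NULL means "no device". */
static int split_dev_err(char *buf, size_t len, const struct split_device *dev,
			 const char *func, const char *msg)
{
	if (dev)
		return snprintf(buf, len, "%s %s: %s", dev->name, func, msg);
	return snprintf(buf, len, "%s: %s", func, msg);
}

/* Bus-level wrapper: forwards bus->dev when a bus exists. */
static int split_err(char *buf, size_t len, const struct split_bus *bus,
		     const char *func, const char *msg)
{
	return split_dev_err(buf, len, bus ? bus->dev : NULL, func, msg);
}
```

With this shape, a caller that holds a valid device but no valid bus can call the device-level function directly, while existing bus-based callers stay unchanged. It touches more code than the one-line fix in the patch, which is the trade-off discussed in the replies.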
Ahmad Fatoum Aug. 17, 2021, 1:19 p.m. UTC | #7
On 17.08.21 15:06, Andy Shevchenko wrote:
> On Tue, Aug 17, 2021 at 3:07 PM Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
>> On 17.08.21 14:03, Ahmad Fatoum wrote:
>>> On 17.08.21 13:54, Andy Shevchenko wrote:
>>>> On Tue, Aug 17, 2021 at 2:11 PM Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
>>>>> On 17.08.21 13:02, Andy Shevchenko wrote:
>>>>>> On Tuesday, August 17, 2021, Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
> 
> ...
> 
>>>>>>>         err = brcmf_pcie_probe(pdev, NULL);
>>>>>>>         if (err)
>>>>>>> -               brcmf_err(bus, "probe after resume failed, err=%d\n", err);
>>>>>>> +               __brcmf_err(NULL, __func__, "probe after resume failed,
>>>>>>> err=%d\n",
>>>>>>
>>>>>>
>>>>>> This is weird looking line now. Why can’t you simply use dev_err() /
>>>>>> netdev_err()?
>>>>>
>>>>> That's what brcmf_err normally expands to, but in this file the macro
>>>>> is overridden to add the extra first argument.
>>>>
>>>> So the problem is in the macro here. You need another set of
>>>> macro(s) that uses the dev pointer directly. When you have a valid
>>>> device, use it, and that seems to be the case here.
>>>
>>> Ah, you mean using pdev instead of the stale bus. Yes, I could do that.
>>> Thanks for pointing that out.
>>
>> Ah, not so easy: __brcmf_err accepts a struct brcmf_bus * as first argument,
>> but there is none I can pass along. As the whole file uses the brcmf_
>> logging functions, I'd just leave this one without a device.
> 
> And what exactly prevents you from splitting that into something like
> 
> __brcmf_dev_err() // as current __brcmf_err, but with a dev argument
> {
> ...
> }
> 
> __brcmf_err(bus, ...)  __brcmf_dev_err(bus->dev, ...)
> 
> ?

I like my regression fixes to be short and to the point.

Cheers,
Ahmad
Kalle Valo Aug. 29, 2021, 11:45 a.m. UTC | #8
Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:

> When resuming from suspend, brcmf_pcie_pm_leave_D3 will first attempt a
> hot resume and then fall back to removing the PCI device and then
> reprobing. If this probe fails, the kernel will oops, because brcmf_err,
> which is called to report the failure will dereference the stale bus
> pointer. Open code and use the default bus-less brcmf_err to avoid this.
> 
> Fixes: 8602e62441ab ("brcmfmac: pass bus to the __brcmf_err() in pcie.c")
> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

Patch applied to wireless-drivers-next.git, thanks.

d745ca4f2c4a brcmfmac: pcie: fix oops on failure to resume and reprobe

Patch

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
index 9ef94d7a7ca7..d824bea4b79d 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
@@ -2209,7 +2209,7 @@  static int brcmf_pcie_pm_leave_D3(struct device *dev)
 
 	err = brcmf_pcie_probe(pdev, NULL);
 	if (err)
-		brcmf_err(bus, "probe after resume failed, err=%d\n", err);
+		__brcmf_err(NULL, __func__, "probe after resume failed, err=%d\n", err);
 
 	return err;
 }