Message ID | 1496225353-5544-1-git-send-email-baijiaju1990@163.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Kalle Valo |
Headers | show |
Jia-Ju Bai <baijiaju1990@163.com> writes: > The driver may sleep under a spin lock, and the function call path is: > b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave) > b43legacy_synchronize_irq > synchronize_irq --> may sleep > > To fix it, the lock is released before b43legacy_synchronize_irq, and the > lock is acquired again after this function. > > Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> > --- > drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c > index f1e3dad..31ead21 100644 > --- a/drivers/net/wireless/broadcom/b43legacy/main.c > +++ b/drivers/net/wireless/broadcom/b43legacy/main.c > @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, > b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); > > if (changed & BSS_CHANGED_BSSID) { > + spin_unlock_irqrestore(&wl->irq_lock, flags); > b43legacy_synchronize_irq(dev); > + spin_lock_irqsave(&wl->irq_lock, flags); To me this looks like a fragile workaround and not a real fix. You can easily add new race conditions with releasing the lock like this.
On 31-05-17 12:26, Kalle Valo wrote: > Jia-Ju Bai <baijiaju1990@163.com> writes: > >> The driver may sleep under a spin lock, and the function call path is: >> b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave) >> b43legacy_synchronize_irq >> synchronize_irq --> may sleep >> >> To fix it, the lock is released before b43legacy_synchronize_irq, and the >> lock is acquired again after this function. >> >> Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> >> --- >> drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c >> index f1e3dad..31ead21 100644 >> --- a/drivers/net/wireless/broadcom/b43legacy/main.c >> +++ b/drivers/net/wireless/broadcom/b43legacy/main.c >> @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, >> b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); >> >> if (changed & BSS_CHANGED_BSSID) { >> + spin_unlock_irqrestore(&wl->irq_lock, flags); >> b43legacy_synchronize_irq(dev); >> + spin_lock_irqsave(&wl->irq_lock, flags); > > To me this looks like a fragile workaround and not a real fix. You can > easily add new race conditions with releasing the lock like this. Hi Jia-Ju, Agree with Kalle as I was about to say the same thing. You really need to determine what is protected by the irq_lock. Here you are using the lock because you are about to change wl->bssid a bit further down. Did not check the entire function but it seems the lock perimeter is too wide. Regards, Arend
On Wed, 31 May 2017 13:26:43 +0300 Kalle Valo <kvalo@codeaurora.org> wrote: > Jia-Ju Bai <baijiaju1990@163.com> writes: > > > The driver may sleep under a spin lock, and the function call path is: > > b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave) > > b43legacy_synchronize_irq > > synchronize_irq --> may sleep > > > > To fix it, the lock is released before b43legacy_synchronize_irq, and the > > lock is acquired again after this function. > > > > Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> > > --- > > drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c > > index f1e3dad..31ead21 100644 > > --- a/drivers/net/wireless/broadcom/b43legacy/main.c > > +++ b/drivers/net/wireless/broadcom/b43legacy/main.c > > @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, > > b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); > > > > if (changed & BSS_CHANGED_BSSID) { > > + spin_unlock_irqrestore(&wl->irq_lock, flags); > > b43legacy_synchronize_irq(dev); > > + spin_lock_irqsave(&wl->irq_lock, flags); > > To me this looks like a fragile workaround and not a real fix. You can > easily add new race conditions with releasing the lock like this. > I think releasing the lock possibly is fine. It certainly is better than sleeping with a lock held. We disabled the device interrupts just before this line. However I think the synchronize_irq should be outside of the conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So two lines above) I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID is set. On the other hand b43 does not have this irq-disabling foobar anymore. So somebody must have removed it. Maybe you can find the commit that removed this stuff from b43 and port it to b43legacy? So I would vote for moving the synchronize_irq up outside of the conditional and put the unlock/lock sequence around it. And as a second patch on top of that try to remove this stuff altogether like b43 did.
On 05/31/2017 10:32 AM, Michael Büsch wrote: > On Wed, 31 May 2017 13:26:43 +0300 > Kalle Valo <kvalo@codeaurora.org> wrote: > >> Jia-Ju Bai <baijiaju1990@163.com> writes: >> >>> The driver may sleep under a spin lock, and the function call path is: >>> b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave) >>> b43legacy_synchronize_irq >>> synchronize_irq --> may sleep >>> >>> To fix it, the lock is released before b43legacy_synchronize_irq, and the >>> lock is acquired again after this function. >>> >>> Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> >>> --- >>> drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c >>> index f1e3dad..31ead21 100644 >>> --- a/drivers/net/wireless/broadcom/b43legacy/main.c >>> +++ b/drivers/net/wireless/broadcom/b43legacy/main.c >>> @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, >>> b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); >>> >>> if (changed & BSS_CHANGED_BSSID) { >>> + spin_unlock_irqrestore(&wl->irq_lock, flags); >>> b43legacy_synchronize_irq(dev); >>> + spin_lock_irqsave(&wl->irq_lock, flags); >> >> To me this looks like a fragile workaround and not a real fix. You can >> easily add new race conditions with releasing the lock like this. >> > > > I think releasing the lock possibly is fine. It certainly is better than > sleeping with a lock held. > We disabled the device interrupts just before this line. > > However I think the synchronize_irq should be outside of the > conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So > two lines above) > I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID > is set. > > > On the other hand b43 does not have this irq-disabling foobar anymore. > So somebody must have removed it. Maybe you can find the commit that > removed this stuff from b43 and port it to b43legacy? > > > So I would vote for moving the synchronize_irq up outside of the > conditional and put the unlock/lock sequence around it. > And as a second patch on top of that try to remove this stuff > altogether like b43 did. The patch that removed it in b43 is commit 36dbd9548e92268127b0c31b0e121e63e9207108 Author: Michael Buesch <mb@bu3sch.de> Date: Fri Sep 4 22:51:29 2009 +0200 b43: Use a threaded IRQ handler Use a threaded IRQ handler to allow locking the mutex and sleeping while executing an interrupt. This removes usage of the irq_lock spinlock, but introduces a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel hard-irq handler. Sleeping busses (SDIO) will use mutex instead. Signed-off-by: Michael Buesch <mb@bu3sch.de> Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> I vaguely remember this patch. Although it is roughly a 1000-line fix, I will try to port it to b43legacy. I still have an old BCM4306 PCMCIA card that I can test in a PowerBook G4. I agree with Michael that this is the way to go. Both of Jia-Ju's patches should be rejected. Larry
On 06/01/2017 08:07 AM, Larry Finger wrote: > On 05/31/2017 10:32 AM, Michael Büsch wrote: >> On Wed, 31 May 2017 13:26:43 +0300 >> Kalle Valo <kvalo@codeaurora.org> wrote: >> >>> Jia-Ju Bai <baijiaju1990@163.com> writes: >>> >>>> The driver may sleep under a spin lock, and the function call path is: >>>> b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave) >>>> b43legacy_synchronize_irq >>>> synchronize_irq --> may sleep >>>> >>>> To fix it, the lock is released before b43legacy_synchronize_irq, >>>> and the >>>> lock is acquired again after this function. >>>> >>>> Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> >>>> --- >>>> drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++ >>>> 1 file changed, 2 insertions(+) >>>> >>>> diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c >>>> b/drivers/net/wireless/broadcom/b43legacy/main.c >>>> index f1e3dad..31ead21 100644 >>>> --- a/drivers/net/wireless/broadcom/b43legacy/main.c >>>> +++ b/drivers/net/wireless/broadcom/b43legacy/main.c >>>> @@ -2859,7 +2859,9 @@ static void >>>> b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, >>>> b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); >>>> if (changed & BSS_CHANGED_BSSID) { >>>> + spin_unlock_irqrestore(&wl->irq_lock, flags); >>>> b43legacy_synchronize_irq(dev); >>>> + spin_lock_irqsave(&wl->irq_lock, flags); >>> >>> To me this looks like a fragile workaround and not a real fix. You can >>> easily add new race conditions with releasing the lock like this. >>> >> >> >> I think releasing the lock possibly is fine. It certainly is better than >> sleeping with a lock held. >> We disabled the device interrupts just before this line. >> >> However I think the synchronize_irq should be outside of the >> conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So >> two lines above) >> I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID >> is set. >> >> >> On the other hand b43 does not have this irq-disabling foobar anymore. >> So somebody must have removed it. Maybe you can find the commit that >> removed this stuff from b43 and port it to b43legacy? >> >> >> So I would vote for moving the synchronize_irq up outside of the >> conditional and put the unlock/lock sequence around it. >> And as a second patch on top of that try to remove this stuff >> altogether like b43 did. > > The patch that removed it in b43 is > > commit 36dbd9548e92268127b0c31b0e121e63e9207108 > Author: Michael Buesch <mb@bu3sch.de> > Date: Fri Sep 4 22:51:29 2009 +0200 > > b43: Use a threaded IRQ handler > > Use a threaded IRQ handler to allow locking the mutex and > sleeping while executing an interrupt. > This removes usage of the irq_lock spinlock, but introduces > a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel > hard-irq handler. Sleeping busses (SDIO) will use mutex instead. > > Signed-off-by: Michael Buesch <mb@bu3sch.de> > Tested-by: Larry Finger <Larry.Finger@lwfinger.net> > Signed-off-by: John W. Linville <linville@tuxdriver.com> > > I vaguely remember this patch. Although it is roughly a 1000-line fix, > I will try to port it to b43legacy. I still have an old BCM4306 PCMCIA > card that I can test in a PowerBook G4. > > I agree with Michael that this is the way to go. Both of Jia-Ju's > patches should be rejected. > > Larry > > It is fine to me to fix the bug by porting this former patch. Thanks, Jia-Ju Bai
Michael Büsch <m@bues.ch> writes: >> > --- a/drivers/net/wireless/broadcom/b43legacy/main.c >> > +++ b/drivers/net/wireless/broadcom/b43legacy/main.c >> > @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, >> > b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); >> > >> > if (changed & BSS_CHANGED_BSSID) { >> > + spin_unlock_irqrestore(&wl->irq_lock, flags); >> > b43legacy_synchronize_irq(dev); >> > + spin_lock_irqsave(&wl->irq_lock, flags); >> >> To me this looks like a fragile workaround and not a real fix. You can >> easily add new race conditions with releasing the lock like this. >> > > > I think releasing the lock possibly is fine. It certainly is better than > sleeping with a lock held. Sure, but IMHO in general I think the practise of releasing the lock like this in a middle of function is dangerous as one can easily miss that upper and lower halves of the function are not actually atomic anymore. And in this case that it's under a conditional makes it even worse.
On Thu, 01 Jun 2017 07:27:20 +0300 Kalle Valo <kvalo@codeaurora.org> wrote: > Michael Büsch <m@bues.ch> writes: > > >> > --- a/drivers/net/wireless/broadcom/b43legacy/main.c > >> > +++ b/drivers/net/wireless/broadcom/b43legacy/main.c > >> > @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, > >> > b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); > >> > > >> > if (changed & BSS_CHANGED_BSSID) { > >> > + spin_unlock_irqrestore(&wl->irq_lock, flags); > >> > b43legacy_synchronize_irq(dev); > >> > + spin_lock_irqsave(&wl->irq_lock, flags); > >> > >> To me this looks like a fragile workaround and not a real fix. You can > >> easily add new race conditions with releasing the lock like this. > >> > > > > > > I think releasing the lock possibly is fine. It certainly is better than > > sleeping with a lock held. > > Sure, but IMHO in general I think the practise of releasing the lock > like this in a middle of function is dangerous as one can easily miss > that upper and lower halves of the function are not actually atomic > anymore. And in this case that it's under a conditional makes it even > worse. > Yes in general I agree. Releasing and re-acquiring a lock is dangerous. But I think in this special case here it might be harmless. The irq_lock is used mostly (if not exclusively; I don't fully remember) to protect against the IRQ top and bottom half. But we disabled the device IRQs a line above and the purpose of this synchronize is to make sure the handler will finish and thus make dropping the lock save. Of course it does not make sense to do this with the lock held :)
On Wed, 31 May 2017 19:07:15 -0500 Larry Finger <Larry.Finger@lwfinger.net> wrote: > On 05/31/2017 10:32 AM, Michael Büsch wrote: > > On Wed, 31 May 2017 13:26:43 +0300 > > Kalle Valo <kvalo@codeaurora.org> wrote: > > > >> Jia-Ju Bai <baijiaju1990@163.com> writes: > >> > >>> The driver may sleep under a spin lock, and the function call path is: > >>> b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave) > >>> b43legacy_synchronize_irq > >>> synchronize_irq --> may sleep > >>> > >>> To fix it, the lock is released before b43legacy_synchronize_irq, and the > >>> lock is acquired again after this function. > >>> > >>> Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> > >>> --- > >>> drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++ > >>> 1 file changed, 2 insertions(+) > >>> > >>> diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c > >>> index f1e3dad..31ead21 100644 > >>> --- a/drivers/net/wireless/broadcom/b43legacy/main.c > >>> +++ b/drivers/net/wireless/broadcom/b43legacy/main.c > >>> @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, > >>> b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); > >>> > >>> if (changed & BSS_CHANGED_BSSID) { > >>> + spin_unlock_irqrestore(&wl->irq_lock, flags); > >>> b43legacy_synchronize_irq(dev); > >>> + spin_lock_irqsave(&wl->irq_lock, flags); > >> > >> To me this looks like a fragile workaround and not a real fix. You can > >> easily add new race conditions with releasing the lock like this. > >> > > > > > > I think releasing the lock possibly is fine. It certainly is better than > > sleeping with a lock held. > > We disabled the device interrupts just before this line. > > > > However I think the synchronize_irq should be outside of the > > conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So > > two lines above) > > I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID > > is set. > > > > > > On the other hand b43 does not have this irq-disabling foobar anymore. > > So somebody must have removed it. Maybe you can find the commit that > > removed this stuff from b43 and port it to b43legacy? > > > > > > So I would vote for moving the synchronize_irq up outside of the > > conditional and put the unlock/lock sequence around it. > > And as a second patch on top of that try to remove this stuff > > altogether like b43 did. > > The patch that removed it in b43 is > > commit 36dbd9548e92268127b0c31b0e121e63e9207108 > Author: Michael Buesch <mb@bu3sch.de> > Date: Fri Sep 4 22:51:29 2009 +0200 Damn it :D > b43: Use a threaded IRQ handler > > Use a threaded IRQ handler to allow locking the mutex and > sleeping while executing an interrupt. > This removes usage of the irq_lock spinlock, but introduces > a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel > hard-irq handler. Sleeping busses (SDIO) will use mutex instead. > > Signed-off-by: Michael Buesch <mb@bu3sch.de> > Tested-by: Larry Finger <Larry.Finger@lwfinger.net> > Signed-off-by: John W. Linville <linville@tuxdriver.com> > > I vaguely remember this patch. Although it is roughly a 1000-line fix, I will > try to port it to b43legacy. I still have an old BCM4306 PCMCIA card that I can > test in a PowerBook G4. > > I agree with Michael that this is the way to go. Both of Jia-Ju's patches should > be rejected. I'm not sure if it's worth it. There is a risk that this would introduce new bugs. But sure, please feel free to try it. This way we can find out how big this change becomes.
diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c index f1e3dad..31ead21 100644 --- a/drivers/net/wireless/broadcom/b43legacy/main.c +++ b/drivers/net/wireless/broadcom/b43legacy/main.c @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); if (changed & BSS_CHANGED_BSSID) { + spin_unlock_irqrestore(&wl->irq_lock, flags); b43legacy_synchronize_irq(dev); + spin_lock_irqsave(&wl->irq_lock, flags); if (conf->bssid) memcpy(wl->bssid, conf->bssid, ETH_ALEN);
The driver may sleep under a spin lock, and the function call path is: b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave) b43legacy_synchronize_irq synchronize_irq --> may sleep To fix it, the lock is released before b43legacy_synchronize_irq, and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> --- drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++ 1 file changed, 2 insertions(+)