| Submitter | Jiri Slaby |
|---|---|
| Date | 2009-01-07 14:36:05 |
| Message ID | <1231338965-796-1-git-send-email-jirislaby@gmail.com> |
| Permalink | /patch/1169/ |
| State | New |
Comments
On Wed, Jan 07, 2009 at 03:36:05PM +0100, Jiri Slaby wrote:
> On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > Dhaval Giani wrote:
> >> I see this on current git. Not sure how to reproduce it, has happened on
> >> two random occasions. At both times, I was not connected to a wireless
> >> network, but to wired networks.
> >>
> >> ------------[ cut here ]------------
> >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> >> ...
> >> Call Trace:
> >> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> >> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> >> ...
> >
> > Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> > use the kernel till the problem appears again?
>
> I don't think this will print anything, the rate won't be 32, it's rather
> too high. Could you apply also the appended debug one?
>

I will apply both the patches and try it out again. As I mentioned
earlier, I am not sure how to reproduce the WARN_ON. I will get back to
you in about a day or two.

> ---
>  net/mac80211/rx.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
> index 7175ae8..5e17e57 100644
> --- a/net/mac80211/rx.c
> +++ b/net/mac80211/rx.c
> @@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
>  		 * MCS aware. */
>  		rate = &sband->bitrates[sband->n_bitrates - 1];
>  	} else {
> -		if (WARN_ON(status->rate_idx < 0 ||
> -			    status->rate_idx >= sband->n_bitrates))
> +		if (WARN(status->rate_idx < 0 ||
> +			 status->rate_idx >= sband->n_bitrates,
> +			 "RATE=%u, BAND=%x\n", status->rate_idx,
> +			 sband->n_bitrates))
>  			return;
>  		rate = &sband->bitrates[status->rate_idx];
>  	}
> --
> 1.6.0.6
On 01/07/2009 04:22 PM, Dhaval Giani wrote:
> I will get back to you in about a day or two.
No problem. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Hi Jiri,

On Wed, Jan 7, 2009 at 9:00 PM, Jiri Slaby <jirislaby@gmail.com> wrote:
> On 01/07/2009 04:22 PM, Dhaval Giani wrote:
>> I will get back to you in about a day or two.
>
> No problem. Thanks.
>

So I finally managed to hit this on 2.6.29-rc3. It is hard to
reproduce, so I hope so much information is enough to give you a good
guess. This time it hit while trying to connect to an open network at
the airport.

Thanks,
Dhaval
On Mon, Feb 02, 2009 at 01:27:39PM +0530, Dhaval Giani wrote:
> So I finally managed to hit this on 2.6.29-rc3. It is hard to
> reproduce, so I hope so much information is enough to give you a good
> guess. This time it hit while trying to connect to an open network at
> the airport.

> WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0x96/0x571 [mac80211]()
> Hardware name: 2007CS3
> RATE=255, BAND=8

band is supposed to be sc->curband? 8 is way wrong. rate could be 255
if, for some reason, the hardware rate wasn't in the rate table.

> Pid: 2634, comm: X Not tainted 2.6.29-rc3 #18
> Call Trace:
> [<c0430636>] warn_slowpath+0x76/0xad
> [<c04508d7>] ? __lock_acquire+0xb36/0xb45
> [<f7dd0205>] __ieee80211_rx+0x96/0x571 [mac80211]
> [<f7e37976>] ath5k_tasklet_rx+0x4fb/0x53d [ath5k]
> [<c06fa5c3>] ? _spin_unlock_irq+0x27/0x34
> [<c0434f91>] tasklet_action+0x85/0xf0

Interestingly I hit this just the other day -- I was debugging something
else and had a serial console hooked up, otherwise I wouldn't have noticed
at all. It happened at some point during a suspend-to-ram sequence, which
makes at least my version sound like a race condition of some sort.
On 15.2.2009 14:47, Bob Copeland wrote:
> On Mon, Feb 02, 2009 at 01:27:39PM +0530, Dhaval Giani wrote:
>> So I finally managed to hit this on 2.6.29-rc3. It is hard to
>> reproduce, so I hope so much information is enough to give you a good
>> guess. This time it hit while trying to connect to an open network at
>> the airport.
>
>> WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0x96/0x571 [mac80211]()
>> Hardware name: 2007CS3
>> RATE=255, BAND=8
>
> band is supposed to be sc->curband? 8 is way wrong.

If you look into the patch which outputs this (backtrace in this
thread), sband->n_bitrates is 8. I have no idea what I have been smoking
the day I wrote it, but BAND= for sure isn't the right name for that
thing. Sorry for the confusion.

> rate could be 255
> if, for some reason, the hardware rate wasn't in the rate table.

So, we have a fix for this, right? I mean the u8->s8 sc->rate_idx
conversion or alike...
Hi

On Wednesday, 7 January 2009, Jiri Slaby wrote:
> On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > Dhaval Giani wrote:
> >> I see this on current git. Not sure how to reproduce it, has happened on
> >> two random occasions. At both times, I was not connected to a wireless
> >> network, but to wired networks.
> >>
> >> ------------[ cut here ]------------
> >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> >> ...
> >> Call Trace:
> >> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> >> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> >> ...
> >
> > Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> > use the kernel till the problem appears again?

It seems as if this problem wouldn't be restricted to ath5k, I just
triggered something very similar on b43 and 2.6.29-rc8-git1 (i386, hard
preemption):

b43-phy0: Broadcom 4306 WLAN found (core revision 5)
wmaster0 (b43): not using net_device_ops yet
phy0: Selected rate control algorithm 'minstrel'
wlan0 (b43): not using net_device_ops yet
Broadcom 43xx driver loaded [ Features: PMLR, Firmware-ID: FW13 ]
udev: renamed network interface wlan0 to wlan1
[...]
input: b43-phy0 as /devices/virtual/input/input8
b43 ssb0:0: firmware: requesting b43/ucode5.fw
b43 ssb0:0: firmware: requesting b43/pcm5.fw
b43 ssb0:0: firmware: requesting b43/b0g0initvals5.fw
b43 ssb0:0: firmware: requesting b43/b0g0bsinitvals5.fw
b43-phy0: Loading firmware version 410.2160 (2007-05-26 15:32:10)
Registered led device: b43-phy0::tx
Registered led device: b43-phy0::rx
Registered led device: b43-phy0::radio
b43-phy0: Radio turned on by software
[...]
ADDRCONF(NETDEV_UP): wlan1: link is not ready
wlan1: authenticate with AP 00:15:f2:7e:9b:7d
wlan1: authenticated
wlan1: associate with AP 00:15:f2:7e:9b:7d
wlan1: RX AssocResp from 00:15:f2:7e:9b:7d (capab=0x411 status=0 aid=2)
wlan1: associated
ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[...]
wlan1: no IPv6 routers present
b43-phy0 ERROR: PHY transmission error
b43-phy0 ERROR: PHY transmission error

[ lots of these, likely to be caused by minstrel being a little too
  optimistic about the possible wlan rates (it was more conservative in
  2.6.28 and didn't happen there); the distance between both stations is
  on the upper end ]

b43-phy0 ERROR: PHY transmission error
__ratelimit: 9 callbacks suppressed
b43-phy0 ERROR: PHY transmission error
b43-phy0 ERROR: PHY transmission error
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
Hardware name: Amilo D-Series
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 0, comm: swapper Not tainted 2.6.29-rc8-sidux-686 #1
Call Trace:
 [<c01319d7>] warn_slowpath+0x87/0xe0
 [<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
 [<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
 [<c0124fc3>] __wake_up_common+0x43/0x70
 [<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
 [<c011e9a5>] default_spin_lock_flags+0x5/0x10
 [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
 [<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
 [<c013692c>] tasklet_action+0x6c/0xf0
 [<c0137147>] __do_softirq+0x87/0x140
 [<c011e9a5>] default_spin_lock_flags+0x5/0x10
 [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
 [<c0137255>] do_softirq+0x55/0x60
 [<c0137495>] irq_exit+0x75/0x90
 [<c0106378>] do_IRQ+0x48/0x90
 [<c0104527>] common_interrupt+0x27/0x2c
 [<cf8372e4>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
 [<c02fd3bf>] cpuidle_idle_call+0x6f/0xc0
 [<c0102de6>] cpu_idle+0x66/0xa0
---[ end trace c754f566bbe5ac47 ]---
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
Hardware name: Amilo D-Series
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 0, comm: swapper Tainted: G W 2.6.29-rc8-sidux-686 #1
Call Trace:
 [<c01319d7>] warn_slowpath+0x87/0xe0
 [<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
 [<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
 [<d0055f15>] b43_led_turn_off+0x55/0x90 [b43]
 [<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
 [<c011e9a5>] default_spin_lock_flags+0x5/0x10
 [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
 [<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
 [<c013692c>] tasklet_action+0x6c/0xf0
 [<c0137147>] __do_softirq+0x87/0x140
 [<c011e9a5>] default_spin_lock_flags+0x5/0x10
 [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
 [<c0137255>] do_softirq+0x55/0x60
 [<c0137495>] irq_exit+0x75/0x90
 [<c0106378>] do_IRQ+0x48/0x90
 [<c0104527>] common_interrupt+0x27/0x2c
 [<cf8372e4>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
 [<c02fd3bf>] cpuidle_idle_call+0x6f/0xc0
 [<c0102de6>] cpu_idle+0x66/0xa0
---[ end trace c754f566bbe5ac48 ]---
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
Hardware name: Amilo D-Series
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 1873, comm: kjournald Tainted: G W 2.6.29-rc8-sidux-686 #1
Call Trace:
 [<c01319d7>] warn_slowpath+0x87/0xe0
 [<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
 [<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
 [<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
 [<c011e9a5>] default_spin_lock_flags+0x5/0x10
 [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
 [<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
 [<c013692c>] tasklet_action+0x6c/0xf0
 [<c0137147>] __do_softirq+0x87/0x140
 [<c011e9a5>] default_spin_lock_flags+0x5/0x10
 [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
 [<c0137255>] do_softirq+0x55/0x60
 [<c0137495>] irq_exit+0x75/0x90
 [<c0106378>] do_IRQ+0x48/0x90
 [<c01d3f44>] generic_block_bmap+0x54/0x70
 [<c0104527>] common_interrupt+0x27/0x2c
 [<cfbf723c>] __journal_file_buffer+0xdc/0x1d0 [jbd]
 [<cfbf7397>] journal_file_buffer+0x67/0xc0 [jbd]
 [<cfbfe102>] journal_write_metadata_buffer+0x1e2/0x3dc [jbd]
 [<cfbf9e26>] journal_commit_transaction+0x806/0x1120 [jbd]
 [<c013bcc7>] lock_timer_base+0x27/0x60
 [<cfbfd82c>] kjournald+0xac/0x1f0 [jbd]
 [<c01464b0>] autoremove_wake_function+0x0/0x50
 [<cfbfd780>] kjournald+0x0/0x1f0 [jbd]
 [<c01460e9>] kthread+0x39/0x70
 [<c01460b0>] kthread+0x0/0x70
 [<c0104793>] kernel_thread_helper+0x7/0x14
---[ end trace c754f566bbe5ac49 ]---
__ratelimit: 21 callbacks suppressed
b43-phy0 ERROR: PHY transmission error
[...]

Sometimes even the firmware crashes and gets reloaded continuously.

wlan1     IEEE 802.11bg  ESSID:"soyuz"
          Mode:Managed  Frequency:2.422 GHz  Access Point: 00:15:F2:7E:9B:7D
          Bit Rate=18 Mb/s   Tx-Power=20 dBm
          Retry min limit:7   RTS thr:off   Fragment thr=2352 B
          Encryption key:<wpa2psk> [3]   Security mode:open
          Power Management:off
          Link Quality=53/100  Signal level:-75 dBm  Noise level=-65 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

Setting a fixed wlan rate (like 11M) seems to avoid this problem.

> I don't think this will print anything, the rate won't be 32, it's rather
> too high. Could you apply also the appended debug one?

I will apply this patch and give it some more testing tomorrow evening,
this problem is almost 100% reproducible for me at the end of my router's
range and doesn't happen in closer proximity.

> ---
>  net/mac80211/rx.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
> index 7175ae8..5e17e57 100644
> --- a/net/mac80211/rx.c
> +++ b/net/mac80211/rx.c
> @@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
>  		 * MCS aware. */
>  		rate = &sband->bitrates[sband->n_bitrates - 1];
>  	} else {
> -		if (WARN_ON(status->rate_idx < 0 ||
> -			    status->rate_idx >= sband->n_bitrates))
> +		if (WARN(status->rate_idx < 0 ||
> +			 status->rate_idx >= sband->n_bitrates,
> +			 "RATE=%u, BAND=%x\n", status->rate_idx,
> +			 sband->n_bitrates))
>  			return;
>  		rate = &sband->bitrates[status->rate_idx];
>  	}

Regards
	Stefan Lippers-Hollmann
On Sunday 15 March 2009 22:27:13 Stefan Lippers-Hollmann wrote:
> Hi
>
> On Wednesday, 7 January 2009, Jiri Slaby wrote:
> > On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > > Dhaval Giani wrote:
> > >> I see this on current git. Not sure how to reproduce it, has happened on
> > >> two random occasions. At both times, I was not connected to a wireless
> > >> network, but to wired networks.
> > >>
> > >> ------------[ cut here ]------------
> > >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559

I also see this triggering frequently on b43. I'm not quite sure why it happens.

> Sometimes even the firmware crashes and gets reloaded continuously.

Nah, that's most likely a separate bug.
Hi

On Sunday, 15 March 2009, Stefan Lippers-Hollmann wrote:
> Hi
>
> On Wednesday, 7 January 2009, Jiri Slaby wrote:
> > On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > > Dhaval Giani wrote:
> > >> I see this on current git. Not sure how to reproduce it, has happened on
> > >> two random occasions. At both times, I was not connected to a wireless
> > >> network, but to wired networks.
> > >>
> > >> ------------[ cut here ]------------
> > >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> > >> ...
> > >> Call Trace:
> > >> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> > >> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> > >> ...
> > >
> > > Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> > > use the kernel till the problem appears again?
>
> It seems as if this problem wouldn't be restricted to ath5k, I just
> triggered something very similar on b43 and 2.6.29-rc8-git1 (i386, hard
> preemption):
>
> b43-phy0: Broadcom 4306 WLAN found (core revision 5)
[...]
> wlan1: no IPv6 routers present
> b43-phy0 ERROR: PHY transmission error
> b43-phy0 ERROR: PHY transmission error
>
> [ lots of these, likely to be caused by minstrel being a little too
>   optimistic about the possible wlan rates (it was more conservative in
>   2.6.28 and didn't happen there); the distance between both stations is
>   on the upper end ]
>
> b43-phy0 ERROR: PHY transmission error
> __ratelimit: 9 callbacks suppressed
> b43-phy0 ERROR: PHY transmission error
> b43-phy0 ERROR: PHY transmission error
> ------------[ cut here ]------------
> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
> Hardware name: Amilo D-Series
> Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
> Pid: 0, comm: swapper Not tainted 2.6.29-rc8-sidux-686 #1
> Call Trace:
>  [<c01319d7>] warn_slowpath+0x87/0xe0
>  [<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
>  [<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
>  [<c0124fc3>] __wake_up_common+0x43/0x70
>  [<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
>  [<c011e9a5>] default_spin_lock_flags+0x5/0x10
>  [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
>  [<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
>  [<c013692c>] tasklet_action+0x6c/0xf0
>  [<c0137147>] __do_softirq+0x87/0x140
>  [<c011e9a5>] default_spin_lock_flags+0x5/0x10
>  [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
>  [<c0137255>] do_softirq+0x55/0x60
>  [<c0137495>] irq_exit+0x75/0x90
>  [<c0106378>] do_IRQ+0x48/0x90
>  [<c0104527>] common_interrupt+0x27/0x2c
>  [<cf8372e4>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
>  [<c02fd3bf>] cpuidle_idle_call+0x6f/0xc0
>  [<c0102de6>] cpu_idle+0x66/0xa0
> ---[ end trace c754f566bbe5ac47 ]---
[...]
> Sometimes even the firmware crashes and gets reloaded continuously.
[...]
> Setting a fixed wlan rate (like 11M) seems to avoid this problem.
>
> > I don't think this will print anything, the rate won't be 32, it's rather
> > too high. Could you apply also the appended debug one?
>
> I will apply this patch and give it some more testing tomorrow evening,
> this problem is almost 100% reproducible for me at the end of my router's
> range and doesn't happen in closer proximity.
[...]

Sorry for the late response, but I've been unexpectedly away from my
BCM4306 system until today.

Thanks to the following (not yet mainline) patches by Michael Buesch and
Lorenzo Nava on top of 2.6.29-rc8-git5, these problems seem to be "fixed"
(well, the PHY errors are basically just hidden, but as they don't trigger
the firmware watchdog anymore, it's much less of a problem and isn't
actually a user visible problem anymore).

[PATCH] b43: Mask PHY TX error interrupt, if not debugging
http://marc.info/?l=linux-wireless&m=123748731831778&w=2

[PATCH] b43: fix b43_plcp_get_bitrate_idx_ofdm return type
http://marc.info/?l=linux-wireless&m=123774585529189&w=2

Confirming the patch descriptions, Jiri Slaby's debugging patch did reveal
a signedness problem in the return value of
b43_plcp_get_bitrate_idx_ofdm(), which has been fixed by the patch above:

[ this trace happened *without* "b43: fix b43_plcp_get_bitrate_idx_ofdm
  return type", and only "b43: Mask PHY TX error interrupt, if not
  debugging" applied on top of 2.6.29-rc8-git5 ]

------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0xab/0x6b0 [mac80211]()
Hardware name: Amilo D-Series
RATE=255, BAND=c
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss pcmcia snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi i2c_viapro serio_raw snd_seq_device pcspkr i2c_core psmouse snd evdev soundcore via686a shpchp yenta_socket rsrc_nonstatic pcmcia_core via_agp pci_hotplug rtc_cmos parport_pc battery rtc_core rtc_lib parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 0, comm: swapper Not tainted 2.6.29-rc8-sidux-686 #1
Call Trace:
 [<c0131a67>] warn_slowpath+0x87/0xe0
 [<d002d377>] op32_set_current_rxslot+0x27/0x40 [b43]
 [<d002dd53>] b43_dma_rx+0x193/0x420 [b43]
 [<c01ae229>] add_partial+0x19/0x70
 [<cfcd834f>] ieee80211_tasklet_handler+0x11f/0x130 [mac80211]
 [<c03a4195>] _spin_unlock+0x5/0x20
 [<cfce9c6b>] __ieee80211_rx+0xab/0x6b0 [mac80211]
 [<c011ea35>] default_spin_lock_flags+0x5/0x10
 [<c03a3d7e>] _spin_lock_irqsave+0x3e/0x60
 [<cfcd8337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
 [<c01369bc>] tasklet_action+0x6c/0xf0
 [<c01371d7>] __do_softirq+0x87/0x140
 [<c011ea35>] default_spin_lock_flags+0x5/0x10
 [<c03a3d7e>] _spin_lock_irqsave+0x3e/0x60
 [<c01372e5>] do_softirq+0x55/0x60
 [<c0137525>] irq_exit+0x75/0x90
 [<c0106378>] do_IRQ+0x48/0x90
 [<c0104527>] common_interrupt+0x27/0x2c
 [<cf8372cb>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
 [<c02fcfcf>] cpuidle_idle_call+0x6f/0xc0
 [<c0102de6>] cpu_idle+0x66/0xa0
---[ end trace ba8601a4d52a20d2 ]---
------------[ cut here ]------------

So far (after 2.9 GB continuous kernel tarball downloads from a local
mirror) b43 seems to be fine again:

wlan1     IEEE 802.11bg  ESSID:"gemini"
          Mode:Managed  Frequency:2.412 GHz  Access Point: 00:21:27:FF:51:A8
          Bit Rate=54 Mb/s   Tx-Power=20 dBm
          Retry min limit:7   RTS thr:off   Fragment thr=2352 B
          Power Management:off
          Link Quality=54/100  Signal level:-82 dBm  Noise level=-69 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

wlan1     Link encap:Ethernet  HWaddr 00:0f:66:d8:67:ca
          inet addr:192.168.0.70  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::20f:66ff:fed8:67ca/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2090104 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1082081 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3146865411 (2.9 GiB)  TX bytes:93054386 (88.7 MiB)

Fetched 83.2MB in 1min18s (1058kB/s)
[...]
Fetched 83.2MB in 1min1s (1362kB/s)

Thank you and sorry about the late response.

Regards
	Stefan Lippers-Hollmann

Post scriptum: I'm not able to trigger this trace with ath5k/ AR2425.
On Mon, Mar 23, 2009 at 01:45:58AM +0100, Stefan Lippers-Hollmann wrote:
>
> Post scriptum: I'm not able to trigger this trace with ath5k/ AR2425.

Okay, well just to be clear ath5k had the same issue (I posted a patch
a couple of weeks ago - I think it got lost and I need to repost it).

But this is separate from the problem where the rate controller is
choosing a bad rate index for TX in adhoc mode, that's still an unknown,
unsolved problem.
On Sun, Mar 01, 2009 at 12:08:07AM +0100, Jiri Slaby wrote:
> On 15.2.2009 14:47, Bob Copeland wrote:
>> On Mon, Feb 02, 2009 at 01:27:39PM +0530, Dhaval Giani wrote:
>>> So I finally managed to hit this on 2.6.29-rc3. It is hard to
>>> reproduce, so I hope so much information is enough to give you a good
>>> guess. This time it hit while trying to connect to an open network at
>>> the airport.
>>
>>> WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0x96/0x571 [mac80211]()
>>> Hardware name: 2007CS3
>>> RATE=255, BAND=8
>>
>> band is supposed to be sc->curband? 8 is way wrong.
>
> If you look into the patch which outputs this (backtrace in this
> thread), sband->n_bitrates is 8. I have no idea what I have been smoking
> the day I wrote it, but BAND= for sure isn't the right name for that
> thing. Sorry for the confusion.
>
>> rate could be 255
>> if, for some reason, the hardware rate wasn't in the rate table.
>
> So, we have a fix for this, right? I mean the u8->s8 sc->rate_idx
> conversion or alike...

Where is the fix? Is it merged in? I still see this happen on 2.6.29

thanks,
On Mon, Mar 30, 2009 at 4:59 AM, Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote:
> Where is the fix? Is it merged in? I still see this happen on 2.6.29
>
> thanks,

It's in b726604706ad88d8b28bc487e45e710f58cc19ee in Linus' tree, after
2.6.29. You still might get a warning, but this time from the driver
side instead of higher up the stack -- if you do please post it.
On Mon, Mar 30, 2009 at 12:58:28PM -0400, Bob Copeland wrote:
> On Mon, Mar 30, 2009 at 4:59 AM, Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote:
> > Where is the fix? Is it merged in? I still see this happen on 2.6.29
> >
> > thanks,
>
> It's in b726604706ad88d8b28bc487e45e710f58cc19ee in Linus' tree, after
> 2.6.29. You still might get a warning, but this time from the driver
> side instead of higher up the stack -- if you do please post it.
>

ok, so my kernel does have this patch applied, and this is what I get,

------------[ cut here ]------------
WARNING: at include/net/mac80211.h:1956 minstrel_get_rate+0xa1/0x4b9 [mac80211]()
Hardware name: 2007CS3
Modules linked in: fuse radeon drm ipt_MASQUERADE iptable_nat nf_nat bridge stp bnep sco l2cap bluetooth ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath kvm_intel kvm uinput snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy arc4 snd_seq_oss snd_seq_midi_event snd_seq ecb ath5k nsc_ircc snd_seq_device snd_pcm_oss video snd_mixer_oss snd_pcm mac80211 snd_timer snd yenta_socket i2c_i801 thinkpad_acpi rfkill irda output iTCO_wdt rsrc_nonstatic pcspkr hwmon cfg80211 joydev i2c_core iTCO_vendor_support soundcore crc_ccitt snd_page_alloc [last unloaded: scsi_wait_scan]
Pid: 2389, comm: wpa_supplicant Tainted: G W 2.6.29-tip #28
Call Trace:
 [<c0431b0e>] warn_slowpath+0x76/0xad
 [<c04523e1>] ? print_lock_contention_bug+0x14/0xd7
 [<c042e874>] ? default_wake_function+0x10/0x12
 [<c04523e1>] ? print_lock_contention_bug+0x14/0xd7
 [<f7d31879>] minstrel_get_rate+0xa1/0x4b9 [mac80211]
 [<c0450fa4>] ? trace_hardirqs_on+0xb/0xd
 [<c0424909>] ? __wake_up+0x36/0x40
 [<f7d272fe>] ? invoke_tx_handlers+0x3b1/0xa50 [mac80211]
 [<f7d21b1e>] rate_control_get_rate+0x7e/0xbe [mac80211]
 [<f7d27330>] invoke_tx_handlers+0x3e3/0xa50 [mac80211]
 [<c0450e61>] ? trace_hardirqs_on_caller+0x18/0x150
 [<f7d26c03>] ? __ieee80211_tx_prepare+0x24b/0x288 [mac80211]
 [<f7d286ad>] ieee80211_master_start_xmit+0x38b/0x4b2 [mac80211]
 [<c069d1f4>] dev_hard_start_xmit+0x219/0x280
 [<c06ac17e>] __qdisc_run+0xca/0x1b0
 [<c069d6de>] dev_queue_xmit+0x398/0x4bf
 [<f7d2a116>] ieee80211_tx_skb+0x53/0x56 [mac80211]
 [<f7d1dac4>] ieee80211_send_deauth_disassoc+0xd7/0xdf [mac80211]
 [<f7d1dbc1>] ieee80211_set_disassoc+0xf5/0x209 [mac80211]
 [<f7d1ddc6>] ieee80211_sta_req_auth+0x47/0x69 [mac80211]
 [<f7d17c5a>] ieee80211_ioctl_siwgenie+0x50/0x5d [mac80211]
 [<c06f9720>] ioctl_standard_call+0x1b4/0x268
 [<c069b3ce>] ? dev_name_hash+0x1b/0x47
 [<c06f92e7>] wext_handle_ioctl+0xe7/0x17d
 [<f7d17c0a>] ? ieee80211_ioctl_siwgenie+0x0/0x5d [mac80211]
 [<c04937ba>] ? might_fault+0x83/0x85
 [<c069f06f>] dev_ioctl+0x5c6/0x5e6
 [<c0690bf3>] ? sockfd_lookup_light+0x1b/0x4e
 [<c0691b65>] ? sys_sendto+0xa9/0xc8
 [<c04cf997>] ? dnotify_parent+0x22/0x63
 [<c0690746>] ? sock_ioctl+0x0/0x1f0
 [<c069092a>] sock_ioctl+0x1e4/0x1f0
 [<c0690746>] ? sock_ioctl+0x0/0x1f0
 [<c04b6d55>] vfs_ioctl+0x27/0x6e
 [<c04b72d4>] do_vfs_ioctl+0x46f/0x4a8
 [<c0691ba1>] ? sys_send+0x1d/0x1f
 [<c04b7352>] sys_ioctl+0x45/0x5f
 [<c04032a4>] sysenter_do_call+0x12/0x38
---[ end trace 0e3d1a2e9037b74b ]---

> --
> Bob Copeland %% www.bobcopeland.com
On Mon, Mar 30, 2009 at 1:59 PM, Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote:
> ok, so my kernel does have this patch applied, and this is what I get,
>
> ------------[ cut here ]------------
> WARNING: at include/net/mac80211.h:1956 minstrel_get_rate+0xa1/0x4b9 [mac80211]()

I believe this is something different (tx path not rx). I think it's
that minstrel rate table bug again, which we never solved for ath5k.

Are you using adhoc or managed mode? Do you have the slab/slub debugging
options turned on? Any steps that consistently reproduce it? Do you
get any warnings with PID controller?
On Mon, Mar 30, 2009 at 02:13:35PM -0400, Bob Copeland wrote:
> On Mon, Mar 30, 2009 at 1:59 PM, Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote:
> > ok, so my kernel does have this patch applied, and this is what I get,
> >
> > ------------[ cut here ]------------
> > WARNING: at include/net/mac80211.h:1956 minstrel_get_rate+0xa1/0x4b9 [mac80211]()
>
> I believe this is something different (tx path not rx). I think it's
> that minstrel rate table bug again, which we never solved for ath5k.
>
> Are you using adhoc or managed mode? Do you have the slab/slub debugging
> options turned on? Any steps that consistently reproduce it? Do you
> get any warnings with PID controller?
>

[dhaval@gondor ~]$ iwconfig wlan0
wlan0     IEEE 802.11abg  ESSID:"linksys_SES_62338"
          Mode:Managed  Frequency:2.462 GHz  Access Point: 00:1A:70:D6:2D:06
          Bit Rate=36 Mb/s   Tx-Power=23 dBm
          Retry min limit:7   RTS thr:off   Fragment thr=2352 B
          Power Management:off
          Link Quality=100/100  Signal level:-49 dBm  Noise level=-96 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

[dhaval@gondor ~]$

[dhaval@gondor linux-2.6]$ grep -i slub .config
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
[dhaval@gondor linux-2.6]$

I am not sure what the PID controller is, and Google gave me a number of
results which did not make much sense in this context.

Yes, I think I know how to reproduce it, but I am not sure what the
real cause is. One way I have found of reproducing it is to connect to
open networks, but it does not always happen. At home, when my network is
set to open, I do not see this issue, whereas at the airport, kaboom. I've
also seen it on LEAP networks, but there were also a few open networks
around. This warning is generally accompanied by a disconnect from the
LEAP connected network, and then the system reconnects.

Let me know if you have patches, I can give them a run and report back.

Thanks,
On Tue, Mar 31, 2009 at 09:21:40AM +0530, Dhaval Giani wrote:
> Am not sure what the PID controller is, and google gave me a number of
> results, which did not make too much sense in the context.

CONFIG_MAC80211_RC_PID -- unfortunately I recall having to jump through
a few config hoops to enable it.

> One way I have found of reproducing it is to connect to open networks,
> but it does not happen always. At home, when my network is set to open,
> I do not see this issue, whereas at the airport, kaboom.

Ok - that is a useful data point. Perhaps something to do with the rates
the peer supports; it would help if you could grab a scan next time you
are in the area. Turn off auto-connect to open networks, then do:

# iw dev wlan0 scan trigger
# iw dev wlan0 scan dump >> dump.out   # do this a few times

Then if a particular peer triggers the problem, we can look at the
advertised rates to see if anything jumps out.
On Tue, Mar 31, 2009 at 8:23 AM, Bob Copeland <me@bobcopeland.com> wrote:
> Ok - that is a useful data point. Perhaps something to do with the rates
> the peer supports; it would help if you could grab a scan next time you
> are in the area. Turn off auto-connect to open networks, then do:

Hi Dhaval,

Would you mind trying this patch and reporting the warnings it triggers?

http://marc.info/?l=linux-kernel&m=123915183521347&q=raw
Patch
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 7175ae8..5e17e57 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
 		 * MCS aware. */
 		rate = &sband->bitrates[sband->n_bitrates - 1];
 	} else {
-		if (WARN_ON(status->rate_idx < 0 ||
-			    status->rate_idx >= sband->n_bitrates))
+		if (WARN(status->rate_idx < 0 ||
+			 status->rate_idx >= sband->n_bitrates,
+			 "RATE=%u, BAND=%x\n", status->rate_idx,
+			 sband->n_bitrates))
 			return;
 		rate = &sband->bitrates[status->rate_idx];
 	}