Message ID | 20110607145435.GA5179@redhat.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
On Tue, 2011-06-07 at 16:54 +0200, Stanislaw Gruszka wrote:
> That could be useful hint, we do not scan chan by chan, but we
> have thing called "plcp check health", which "restart radio"
> by requesting one channel scan. So perhaps disabling that could
> help.

At this moment I'm interested in something (a script, some sequence of
actions, whatever) that (somewhat) reliably triggers this error. Because
right now I have no clue what triggers it.

Is your patch in that category or is it a (crude) fix? If it's a fix,
I'm not sure it is of much help at this stage.


Paul Bolle

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jun 07, 2011 at 09:23:00PM +0200, Paul Bolle wrote:
> On Tue, 2011-06-07 at 16:54 +0200, Stanislaw Gruszka wrote:
> > That could be useful hint, we do not scan chan by chan, but we
> > have thing called "plcp check health", which "restart radio"
> > by requesting one channel scan. So perhaps disabling that could
> > help.
>
> At this moment I'm interested in something (a script, some sequence of
> actions, whatever) that (somewhat) reliably triggers this error. Because
> right now I have no clue what triggers it.

Having a reliable reproducer would definitely be nice. But the bug
could be some kind of race condition that happens in the code flow
once per 10000000000 cases ...

> Is your patch in that category or is it a (crude) fix? If it's a fix,
> I'm not sure it is of much help at this stage.

It could be a possible fix. Why can't you simply apply the patch and see
if the errors are still there? If there are no errors after a week or so,
we could consider the bug fixed; otherwise, well ... we will still need
to look around for a fix.

I just posted a patch that removes the "plcp health check" and related
code on -next anyway, because I don't think it is something that we
need.

Stanislaw
On Wed, 2011-06-08 at 15:47 +0200, Stanislaw Gruszka wrote:
> It could be a possible fix. Why can't you simply apply the patch and see
> if the errors are still there? If there are no errors after a week or so,
> we could consider the bug fixed; otherwise, well ... we will still need
> to look around for a fix.
>
> I just posted a patch that removes the "plcp health check" and related
> code on -next anyway, because I don't think it is something that we
> need.

0) This is just to note that I haven't yet tried to see if your small
patch helps. I still hope to do that as I have not given up on this
issue. Feel free to prod me if I again disappear for too long and you
lose your patience.

1) By the way, I still see this error (every now and then) in my logs.
Most recently while running v3.0.1, so it appears not to be fixed by
recent updates for iwl4965 (if any).


Paul Bolle
On Tue, 2011-06-07 at 16:54 +0200, Stanislaw Gruszka wrote:
> That could be useful hint, we do not scan chan by chan, but we
> have thing called "plcp check health", which "restart radio"
> by requesting one channel scan. So perhaps disabling that could
> help.
>
> [...]
>
> diff --git a/drivers/net/wireless/iwlegacy/iwl-rx.c b/drivers/net/wireless/iwlegacy/iwl-rx.c
> index 654cf23..6062da0 100644
> --- a/drivers/net/wireless/iwlegacy/iwl-rx.c
> +++ b/drivers/net/wireless/iwlegacy/iwl-rx.c
> @@ -230,6 +230,8 @@ EXPORT_SYMBOL(iwl_legacy_rx_spectrum_measure_notif);
>  void iwl_legacy_recover_from_statistics(struct iwl_priv *priv,
>  				struct iwl_rx_packet *pkt)
>  {
> +	return;
> +
>  	if (test_bit(STATUS_EXIT_PENDING, &priv->status))
>  		return;
>  	if (iwl_legacy_is_any_associated(priv)) {

0) I finally got around to applying this patch (to v3.0.4).

1) After a few days of normal usage (with quite a bit of suspend and
resume cycles) this error was again triggered. So avoiding
check_plcp_health() doesn't seem to help.

2) I never sent you the debug output (ie, output after doing "modprobe
iwl4965 debug=0x47ffffff"), did I?


Paul Bolle
On Sun, Sep 04, 2011 at 10:28:35AM +0200, Paul Bolle wrote:
> On Tue, 2011-06-07 at 16:54 +0200, Stanislaw Gruszka wrote:
> > That could be useful hint, we do not scan chan by chan, but we
> > have thing called "plcp check health", which "restart radio"
> > by requesting one channel scan. So perhaps disabling that could
> > help.
> >
> > [...]
> >
> > diff --git a/drivers/net/wireless/iwlegacy/iwl-rx.c b/drivers/net/wireless/iwlegacy/iwl-rx.c
> > index 654cf23..6062da0 100644
> > --- a/drivers/net/wireless/iwlegacy/iwl-rx.c
> > +++ b/drivers/net/wireless/iwlegacy/iwl-rx.c
> > @@ -230,6 +230,8 @@ EXPORT_SYMBOL(iwl_legacy_rx_spectrum_measure_notif);
> >  void iwl_legacy_recover_from_statistics(struct iwl_priv *priv,
> >  				struct iwl_rx_packet *pkt)
> >  {
> > +	return;
> > +
> >  	if (test_bit(STATUS_EXIT_PENDING, &priv->status))
> >  		return;
> >  	if (iwl_legacy_is_any_associated(priv)) {
>
> 0) I finally got around to applying this patch (to v3.0.4).
>
> 1) After a few days of normal usage (with quite a bit of suspend and
> resume cycles) this error was again triggered. So avoiding
> check_plcp_health() doesn't seem to help.
>
> 2) I never sent you the debug output (ie, output after doing "modprobe
> iwl4965 debug=0x47ffffff"), did I?

No, but if the error shows up only after a few days, gathering and
analyzing a few days of debug logs is impractical. Does wifi stop working
after an error, or is there some other negative impact? Or are messages
just printed and the driver recovers by itself?

Stanislaw
On Mon, 2011-09-05 at 11:33 +0200, Stanislaw Gruszka wrote:
> On Sun, Sep 04, 2011 at 10:28:35AM +0200, Paul Bolle wrote:
> > 1) After a few days of normal usage (with quite a bit of suspend and
> > resume cycles) this error was again triggered. So avoiding
> > check_plcp_health() doesn't seem to help.
> >
> > 2) I never sent you the debug output (ie, output after doing "modprobe
> > iwl4965 debug=0x47ffffff"), did I?
>
> No, but if the error shows up only after a few days, gathering and
> analyzing a few days of debug logs is impractical.

I see.

> Does wifi stop working after an error, or is there some other negative
> impact? Or are messages just printed and the driver recovers by itself?

There doesn't seem to be any impact (ie, it might have some impact but
I'm too insensitive to notice). The driver does recover itself and I do
not have to mess with rfkill or "modprobe -r" or whatever. I actually
discovered this because I tend to regularly do
    dmesg -r | grep "^<[123]>"

to keep myself informed of any kernel errors (or worse). And then these
few dozen lines can't go unnoticed.


Paul Bolle
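As an illustration of the filter mentioned above: "dmesg -r" prints each kernel message with its raw priority prefix, and the pattern "^<[123]>" keeps priorities 1 (alert), 2 (crit) and 3 (err). Note that it does not match priority 0 (emerg). The sketch below wraps the same grep in a function and feeds it made-up sample lines; the function name and the sample text are assumptions for demonstration only, not real driver output.

```shell
#!/bin/sh
# Keep only raw kernel log lines whose priority prefix is
# <1> (alert), <2> (crit) or <3> (err). Reads from stdin.
filter_kernel_errors() {
    grep '^<[123]>'
}

# Made-up sample input standing in for the output of "dmesg -r".
printf '%s\n' \
    '<6>[    1.000000] sample informational line' \
    '<3>[    2.000000] sample error line' \
    '<4>[    3.000000] sample warning line' \
    '<2>[    4.000000] sample critical line' \
  | filter_kernel_errors
```

On a live system the function would be used as "dmesg -r | filter_kernel_errors". Widening the character class to [0123] would also catch emergency-level messages.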
On Mon, 2011-09-05 at 12:32 +0200, Paul Bolle wrote:
> On Mon, 2011-09-05 at 11:33 +0200, Stanislaw Gruszka wrote:
> > Does wifi stop working after an error, or is there some other negative
> > impact? Or are messages just printed and the driver recovers by itself?
>
> There doesn't seem to be any impact (ie, it might have some impact but
> I'm too insensitive to notice). The driver does recover itself and I do
> not have to mess with rfkill or "modprobe -r" or whatever. I actually
> discovered this because I tend to regularly do
>     dmesg -r | grep "^<[123]>"
>
> to keep myself informed of any kernel errors (or worse). And then these
> few dozen lines can't go unnoticed.

0) It's one year later now and this Microcode SW error again showed up
in the logs. I recently upgraded and I haven't kept any logs, but my
guess would be that I have run into that error once every week. (This
laptop is now running a v3.5.3 based kernel as shipped for Fedora 17.)

1) Would you have any suggestions how to pinpoint the cause of this
error? It is mainly annoying, and I managed to ignore it since my
previous message, but I still would like to free the logs from the noise
it makes.


Paul Bolle
On Fri, 2012-09-14 at 14:17 +0200, Paul Bolle wrote:
> 0) It's one year later now and this Microcode SW error again showed up
> in the logs. I recently upgraded and I haven't kept any logs, but my
> guess would be that I have run into that error once every week. (This
> laptop is now running a v3.5.3 based kernel as shipped for Fedora 17.)
>
> 1) Would you have any suggestions how to pinpoint the cause of this
> error? It is mainly annoying, and I managed to ignore it since my
> previous message, but I still would like to free the logs from the noise
> it makes.

0) I ported the "iwlegacy_tracing" patch from
https://bugzilla.kernel.org/show_bug.cgi?id=42766 to v3.6-rc7 and to
iwl4965. I've been running iwl4965 with tracing enabled ever since (that
is on: v3.6-rc7, v3.6, v3.6.1, and v3.6.2). Finally, after only three
weeks I hit our Microcode SW error again.

1) So now I've got a 600+k line (or 65 MB) trace dump. What should I do
with it?


Paul Bolle
On Mon, Oct 15, 2012 at 04:51:00PM +0200, Paul Bolle wrote:
> On Fri, 2012-09-14 at 14:17 +0200, Paul Bolle wrote:
> > 0) It's one year later now and this Microcode SW error again showed up
> > in the logs. I recently upgraded and I haven't kept any logs, but my
> > guess would be that I have run into that error once every week. (This
> > laptop is now running a v3.5.3 based kernel as shipped for Fedora 17.)
> >
> > 1) Would you have any suggestions how to pinpoint the cause of this
> > error? It is mainly annoying, and I managed to ignore it since my
> > previous message, but I still would like to free the logs from the noise
> > it makes.
>
> 0) I ported the "iwlegacy_tracing" patch from
> https://bugzilla.kernel.org/show_bug.cgi?id=42766 to v3.6-rc7 and to
> iwl4965. I've been running iwl4965 with tracing enabled ever since (that
> is on: v3.6-rc7, v3.6, v3.6.1, and v3.6.2). Finally, after only three
> weeks I hit our Microcode SW error again.
>
> 1) So now I've got a 600+k line (or 65 MB) trace dump. What should I do
> with it?

Just send me privately, let's say, the last 10MB of it...

Stanislaw
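One way to cut the tail off a large trace dump, as requested above, is "tail -c" with a byte count. This is only a sketch: the function name and the file names in the usage comment are hypothetical, and the cut may begin in the middle of a line.

```shell
#!/bin/sh
# Print the last N bytes of a file.
#   $1: path to the trace dump
#   $2: number of bytes to keep from the end of the file
trace_tail() {
    tail -c "$2" "$1"
}

# Hypothetical usage: keep the last 10 MiB and compress it for mailing.
#   trace_tail trace.txt $((10 * 1024 * 1024)) | gzip > trace-tail.txt.gz
```

Passing the size as a plain byte count avoids relying on size suffixes such as "10M", which not every tail implementation accepts.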
diff --git a/drivers/net/wireless/iwlegacy/iwl-rx.c b/drivers/net/wireless/iwlegacy/iwl-rx.c
index 654cf23..6062da0 100644
--- a/drivers/net/wireless/iwlegacy/iwl-rx.c
+++ b/drivers/net/wireless/iwlegacy/iwl-rx.c
@@ -230,6 +230,8 @@ EXPORT_SYMBOL(iwl_legacy_rx_spectrum_measure_notif);
 void iwl_legacy_recover_from_statistics(struct iwl_priv *priv,
 				struct iwl_rx_packet *pkt)
 {
+	return;
+
 	if (test_bit(STATUS_EXIT_PENDING, &priv->status))
 		return;
 	if (iwl_legacy_is_any_associated(priv)) {