Message ID | 20110615220252.1918.73638.stgit@mj.roinet.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
On 06/16/2011 12:02 AM, Pavel Roskin wrote:
> WARN_ON is not enough, as we cannot return a valid index, and the
> callers will use whatever we return, causing a cascade of oopses and
> eventually a panic.
>
> Signed-off-by: Pavel Roskin <proski@gnu.org>
> ---
> 0 files changed, 0 insertions(+), 0 deletions(-)
>
> diff --git a/include/net/mac80211.h b/include/net/mac80211.h
> index e33fe79..d117019 100644
> --- a/include/net/mac80211.h
> +++ b/include/net/mac80211.h
> @@ -3108,10 +3108,10 @@ rate_lowest_index(struct ieee80211_supported_band *sband,
> 		if (rate_supported(sta, sband->band, i))
> 			return i;
>
> -	/* warn when we cannot find a rate. */
> -	WARN_ON(1);
> +	/* If we cannot find any rate, we are in trouble. */
> +	BUG_ON(1);
>
> -	return 0;
> +	return -EINVAL;
> }

I would expect some description of what the caller should do when -EINVAL is returned. One could even argue whether you want a BUG_ON or allow the caller (driver) to reset the hardware, reassociate, etc. upon -EINVAL. I am not sure under which circumstances this could happen, so there may really be no way out.

Gr. AvS

Hi Pavel

On Wed, Jun 15, 2011 at 06:02:52PM -0400, Pavel Roskin wrote:
> WARN_ON is not enough, as we cannot return a valid index, and the
> callers will use whatever we return, causing a cascade of oopses and
> eventually a panic.

We have a Fedora bug (https://bugzilla.redhat.com/show_bug.cgi?id=702627) where only that warning is generated, and the system keeps working (at least the bug reporter did not mention a hang). When moving to BUG(), the system will simply hang from the user's perspective, which is much worse. I think we should rather fix the callers to be prepared and to recover when rate_lowest_index fails. Of course, fixing the real bug(s) that cause the rate index not to be found would be best.

Stanislaw
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

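To make the "fix the callers" suggestion concrete, here is a minimal userspace sketch of a lookup that reports failure and a caller that checks for it before indexing. It is not the mac80211 code: `demo_band`, `demo_rate_supported`, `demo_rate_lowest_index` and `demo_rate_init` are hypothetical stand-ins invented for illustration, and the bitmask representation of the supported rates is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_RATES 8

/* Hypothetical, simplified stand-in for a band's rate table. */
struct demo_band {
	int n_bitrates;               /* number of valid entries in bitrate[] */
	int bitrate[DEMO_MAX_RATES];  /* rates in units of 100 kbit/s */
};

/* Assumption: bit i of supp_rates set means rate i is allowed. */
int demo_rate_supported(uint32_t supp_rates, int i)
{
	return (int)((supp_rates >> i) & 1u);
}

/* Return the lowest allowed rate index, or -EINVAL if none is allowed. */
int demo_rate_lowest_index(const struct demo_band *band, uint32_t supp_rates)
{
	assert(band != NULL);
	assert(band->n_bitrates <= DEMO_MAX_RATES);

	for (int i = 0; i < band->n_bitrates; i++)
		if (demo_rate_supported(supp_rates, i))
			return i;

	return -EINVAL;
}

/*
 * Caller that is prepared for failure: it never uses a negative
 * return value as an array index and passes the error up instead.
 * *out_bitrate is only written on success.
 */
int demo_rate_init(const struct demo_band *band, uint32_t supp_rates,
		   int *out_bitrate)
{
	int idx = demo_rate_lowest_index(band, supp_rates);

	if (idx < 0)
		return idx;

	*out_bitrate = band->bitrate[idx];
	return 0;
}
```

With this shape the error value never reaches an array subscript. What the caller should do on failure (skip rate control setup, reset, reassociate) is left open here, as it is in the thread.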
On 06/16/2011 05:08 AM, Arend van Spriel wrote:
> On 06/16/2011 12:02 AM, Pavel Roskin wrote:
>> -	WARN_ON(1);
>> +	/* If we cannot find any rate, we are in trouble. */
>> +	BUG_ON(1);
>>
>> -	return 0;
>> +	return -EINVAL;
>> }
>
> I would expect some description of what the caller should do when
> -EINVAL is returned. One could even argue whether you want a BUG_ON
> or allow the caller (driver) to reset the hardware, reassociate, etc.
> upon -EINVAL. I am not sure under which circumstances this could
> happen, so there may really be no way out.

I'm proposing a simple change that replaces memory corruption and unpredictable behavior with a predictable stop. If sysrq is enabled, the system can even be restarted from the keyboard.

The meaning of -EINVAL is symbolic. It shows that we are not returning a fallback value. There is no valid fallback value to return.

The callers should not try to initialize rate control on interfaces with no rates enabled. Anyway, that fix is out of scope for my patch.

On 06/16/2011 06:22 AM, Stanislaw Gruszka wrote:
> Hi Pavel
>
> On Wed, Jun 15, 2011 at 06:02:52PM -0400, Pavel Roskin wrote:
>> WARN_ON is not enough, as we cannot return a valid index, and the
>> callers will use whatever we return, causing a cascade of oopses and
>> eventually a panic.
>
> We have a Fedora bug (https://bugzilla.redhat.com/show_bug.cgi?id=702627)
> where only that warning is generated, and the system keeps working (at
> least the bug reporter did not mention a hang). When moving to BUG(),
> the system will simply hang from the user's perspective, which is much
> worse. I think we should rather fix the callers to be prepared and to
> recover when rate_lowest_index fails. Of course, fixing the real bug(s)
> that cause the rate index not to be found would be best.

In my case, I was able to restart the system by sysrq when using BUG_ON, but the system would hang hard with WARN_ON. I think continuing after returning an invalid index is wrong. It will lead to memory corruption that could in turn lead to corruption of the filesystem.

On 06/17/2011 12:08 AM, Pavel Roskin wrote:
> On 06/16/2011 06:22 AM, Stanislaw Gruszka wrote:
>> Hi Pavel
>>
>> On Wed, Jun 15, 2011 at 06:02:52PM -0400, Pavel Roskin wrote:
>>> WARN_ON is not enough, as we cannot return a valid index, and the
>>> callers will use whatever we return, causing a cascade of oopses and
>>> eventually a panic.
>> We have a Fedora bug (https://bugzilla.redhat.com/show_bug.cgi?id=702627)
>> where only that warning is generated, and the system keeps working (at
>> least the bug reporter did not mention a hang). When moving to BUG(),
>> the system will simply hang from the user's perspective, which is much
>> worse. I think we should rather fix the callers to be prepared and to
>> recover when rate_lowest_index fails. Of course, fixing the real bug(s)
>> that cause the rate index not to be found would be best.
> In my case, I was able to restart the system by sysrq when using BUG_ON,
> but the system would hang hard with WARN_ON. I think continuing after
> returning an invalid index is wrong. It will lead to memory corruption
> that could in turn lead to corruption of the filesystem.

I think it is generally a bad idea to use BUG_ON as a solution. As Stanislaw indicated, there are platforms which can continue without a hang, so you are regressing those. Regarding your patch, I think the -EINVAL is good to have, but leave the WARN_ON.

Gr. AvS

On 06/17/2011 02:37 AM, Arend van Spriel wrote:
> On 06/17/2011 12:08 AM, Pavel Roskin wrote:
>> In my case, I was able to restart the system by sysrq when using BUG_ON,
>> but the system would hang hard with WARN_ON. I think continuing after
>> returning an invalid index is wrong. It will lead to memory corruption
>> that could in turn lead to corruption of the filesystem.
>
> I think it is generally a bad idea to use BUG_ON as a solution. As
> Stanislaw indicated, there are platforms which can continue without a
> hang, so you are regressing those. Regarding your patch, I think the
> -EINVAL is good to have, but leave the WARN_ON.

Well, then it would be pointless. The whole idea was to prevent memory corruption and to stop early. Returning -EINVAL for real (as opposed to having it in the dead code) could lead to worse memory corruption than returning 0.

The real fix would be to make rate control algorithms robust to the case when no rates are allowed. My patch was meant as a first little step towards it by making it easier to capture the backtrace and restart the system. But it's not worth the trouble for me to argue about it.

On Wed, 2011-06-15 at 18:02 -0400, Pavel Roskin wrote:
> WARN_ON is not enough, as we cannot return a valid index, and the
> callers will use whatever we return, causing a cascade of oopses and
> eventually a panic.

Can you show the details of that? 0 should always be an at least semi-valid index pointing to _something_ whereas -EINVAL will cause it to point to random memory that isn't even valid...

johannes

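The difference between the two return values, when a caller uses them unchecked as an array index, can be sketched in userspace without actually touching invalid memory. The `demo_rate` layout and both helper functions below are hypothetical and only illustrate the arithmetic. They are not taken from mac80211.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical rate table entry; the real structure differs. */
struct demo_rate {
	unsigned short bitrate;
	unsigned short hw_value;
	unsigned int flags;
};

/* An index is usable only if it falls inside the table. */
int demo_index_is_valid(int idx, int n_bitrates)
{
	return idx >= 0 && idx < n_bitrates;
}

/*
 * Byte offset from the start of the table that &table[idx] would
 * refer to. A negative result means an address before the table.
 */
long demo_byte_offset(int idx)
{
	return (long)idx * (long)sizeof(struct demo_rate);
}
```

For a non-empty table, index 0 passes `demo_index_is_valid` and has offset 0, so an unchecked caller reads a real (if possibly unsupported) rate entry. `-EINVAL` fails the check and yields a negative offset: on Linux, where `EINVAL` is 22, that is 22 entries before the start of the table.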
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index e33fe79..d117019 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -3108,10 +3108,10 @@ rate_lowest_index(struct ieee80211_supported_band *sband,
 		if (rate_supported(sta, sband->band, i))
 			return i;
 
-	/* warn when we cannot find a rate. */
-	WARN_ON(1);
+	/* If we cannot find any rate, we are in trouble. */
+	BUG_ON(1);
 
-	return 0;
+	return -EINVAL;
 }
 
 static inline

WARN_ON is not enough, as we cannot return a valid index, and the
callers will use whatever we return, causing a cascade of oopses and
eventually a panic.

Signed-off-by: Pavel Roskin <proski@gnu.org>
---
 0 files changed, 0 insertions(+), 0 deletions(-)