mac80211: use BUG_ON and return -EINVAL if rate_lowest_index() fails

Message ID: 20110615220252.1918.73638.stgit@mj.roinet.com
State: Not Applicable, archived

Commit Message

Pavel Roskin June 15, 2011, 10:02 p.m. UTC
WARN_ON is not enough, as we cannot return a valid index, and the
callers will use whatever we return, causing a cascade of oopses and
eventually a panic.

Signed-off-by: Pavel Roskin <proski@gnu.org>
---
 0 files changed, 0 insertions(+), 0 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Arend van Spriel June 16, 2011, 9:08 a.m. UTC | #1
On 06/16/2011 12:02 AM, Pavel Roskin wrote:
> WARN_ON is not enough, as we cannot return a valid index, and the
> callers will use whatever we return, causing a cascade of oopses and
> eventually a panic.
>
> Signed-off-by: Pavel Roskin <proski@gnu.org>
> ---
>   0 files changed, 0 insertions(+), 0 deletions(-)
>
> diff --git a/include/net/mac80211.h b/include/net/mac80211.h
> index e33fe79..d117019 100644
> --- a/include/net/mac80211.h
> +++ b/include/net/mac80211.h
> @@ -3108,10 +3108,10 @@ rate_lowest_index(struct ieee80211_supported_band *sband,
>   		if (rate_supported(sta, sband->band, i))
>   			return i;
>
> -	/* warn when we cannot find a rate. */
> -	WARN_ON(1);
> +	/* If we cannot find any rate, we are in trouble. */
> +	BUG_ON(1);
>
> -	return 0;
> +	return -EINVAL;
>   }

I would expect some description of what the caller should do when -EINVAL
is returned. One could even argue about whether you want a BUG_ON or
should allow the caller (driver) to reset the hardware, reassociate, etc.
upon -EINVAL. I am not sure under which circumstances this could happen,
so there may really be no way out.
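
A minimal user-space sketch of what caller-side handling of -EINVAL might look like. It is a model, not mac80211 code: struct demo_band, demo_rate_lowest_index() and demo_pick_initial_rate() are hypothetical names, and the bitmask stands in for the real rate_supported() check. It assumes at most 32 rates per band.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a band's rate table; not the mac80211 struct. */
struct demo_band {
	int n_bitrates;			/* assumed to be at most 32 */
	unsigned int supported_mask;	/* bit i set => rate i is usable */
};

/* Returns the lowest usable rate index, or -EINVAL if no rate is usable. */
static int demo_rate_lowest_index(const struct demo_band *band)
{
	int i;

	for (i = 0; i < band->n_bitrates && i < 32; i++)
		if (band->supported_mask & (1u << i))
			return i;

	return -EINVAL;
}

/*
 * Hypothetical caller: on error it leaves *rate_idx untouched and passes
 * the error upward, so a negative value is never used as an array index.
 */
static int demo_pick_initial_rate(const struct demo_band *band, int *rate_idx)
{
	int idx;

	assert(rate_idx != NULL);

	idx = demo_rate_lowest_index(band);
	if (idx < 0)
		return idx;	/* the driver could reset or reassociate here */

	*rate_idx = idx;
	return 0;
}
```

Whether a driver can do anything useful with the error is exactly the open question; the sketch only shows that the negative value has to be checked before it is used.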

Gr. AvS
Stanislaw Gruszka June 16, 2011, 10:22 a.m. UTC | #2
Hi Pavel

On Wed, Jun 15, 2011 at 06:02:52PM -0400, Pavel Roskin wrote:
> WARN_ON is not enough, as we cannot return a valid index, and the
> callers will use whatever we return, causing a cascade of oopses and
> eventually a panic.

We have a Fedora bug (https://bugzilla.redhat.com/show_bug.cgi?id=702627)
where only that warning is generated and the system keeps working (at least
the bug reporter did not mention a hang). After moving to BUG(), the
system will simply hang from the user's perspective, which is much worse.
I think we should rather fix the callers to be prepared and to recover
when rate_lowest_index() fails. Of course, fixing the real bug(s) that cause
the rate index not to be found would be best.
 
Stanislaw
Pavel Roskin June 16, 2011, 10:06 p.m. UTC | #3
On 06/16/2011 05:08 AM, Arend van Spriel wrote:
> On 06/16/2011 12:02 AM, Pavel Roskin wrote:

>> - WARN_ON(1);
>> + /* If we cannot find any rate, we are in trouble. */
>> + BUG_ON(1);
>>
>> - return 0;
>> + return -EINVAL;
>> }
>
> I would expect some description of what the caller should do when -EINVAL
> is returned. One could even argue about whether you want a BUG_ON or
> should allow the caller (driver) to reset the hardware, reassociate, etc.
> upon -EINVAL. I am not sure under which circumstances this could happen,
> so there may really be no way out.

I'm proposing a simple change that replaces memory corruption and 
unpredictable behavior with a predictable stop.  If sysrq is enabled, 
the system can even be restarted by the keyboard.

The meaning of -EINVAL is symbolic.  It shows that we are not returning 
a fallback value.  There is no valid fallback value to return.

The callers should not try to initialize rate control on interfaces with 
no rates enabled.  Anyway, the fix is out of scope of my patch.
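
A rough user-space sketch of the precondition described above: refuse to initialize rate control when no rates are enabled. struct demo_sta and demo_rate_init() are hypothetical names for illustration and do not correspond to any mac80211 function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-station state; bit i of supp_rates marks rate i as enabled. */
struct demo_sta {
	unsigned int supp_rates;
	int rc_initialized;
	int lowest_idx;
};

/*
 * Initializes the (modelled) rate control state only if at least one
 * rate is enabled. Returns 0 on success and -EINVAL otherwise, leaving
 * the station state unchanged in the error case.
 */
static int demo_rate_init(struct demo_sta *sta)
{
	unsigned int i;

	assert(sta != NULL);

	if (sta->supp_rates == 0)
		return -EINVAL;

	/* supp_rates is non-zero, so this loop stops at some i < 32. */
	for (i = 0; !(sta->supp_rates & (1u << i)); i++)
		;

	sta->lowest_idx = (int)i;
	sta->rc_initialized = 1;
	return 0;
}
```

With a check like this in the caller, the lookup for the lowest rate is never reached with an empty rate set.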
Pavel Roskin June 16, 2011, 10:08 p.m. UTC | #4
On 06/16/2011 06:22 AM, Stanislaw Gruszka wrote:
> Hi Pavel
>
> On Wed, Jun 15, 2011 at 06:02:52PM -0400, Pavel Roskin wrote:
>> WARN_ON is not enough, as we cannot return a valid index, and the
>> callers will use whatever we return, causing a cascade of oopses and
>> eventually a panic.
>
> We have a Fedora bug (https://bugzilla.redhat.com/show_bug.cgi?id=702627)
> where only that warning is generated and the system keeps working (at least
> the bug reporter did not mention a hang). After moving to BUG(), the
> system will simply hang from the user's perspective, which is much worse.
> I think we should rather fix the callers to be prepared and to recover
> when rate_lowest_index() fails. Of course, fixing the real bug(s) that cause
> the rate index not to be found would be best.

In my case, I was able to restart the system by sysrq when using BUG_ON, 
but the system would hang hard with WARN_ON.  I think continuing after 
returning an invalid index is wrong.  It will lead to memory corruption 
that could in turn lead to corruption of the filesystem.
Arend van Spriel June 17, 2011, 6:37 a.m. UTC | #5
On 06/17/2011 12:08 AM, Pavel Roskin wrote:
> On 06/16/2011 06:22 AM, Stanislaw Gruszka wrote:
>> Hi Pavel
>>
>> On Wed, Jun 15, 2011 at 06:02:52PM -0400, Pavel Roskin wrote:
>>> WARN_ON is not enough, as we cannot return a valid index, and the
>>> callers will use whatever we return, causing a cascade of oopses and
>>> eventually a panic.
>> We have a Fedora bug (https://bugzilla.redhat.com/show_bug.cgi?id=702627)
>> where only that warning is generated and the system keeps working (at least
>> the bug reporter did not mention a hang). After moving to BUG(), the
>> system will simply hang from the user's perspective, which is much worse.
>> I think we should rather fix the callers to be prepared and to recover
>> when rate_lowest_index() fails. Of course, fixing the real bug(s) that cause
>> the rate index not to be found would be best.
> In my case, I was able to restart the system by sysrq when using BUG_ON,
> but the system would hang hard with WARN_ON.  I think continuing after
> returning an invalid index is wrong.  It will lead to memory corruption
> that could in turn lead to corruption of the filesystem.

I think it is generally a bad idea to use BUG_ON as a solution. As 
Stanislaw indicated there are platforms which can continue without a 
hang so you are regressing those. Regarding your patch I think the 
-EINVAL is good to have, but leave the WARN_ON.

Gr. AvS
Pavel Roskin June 17, 2011, 8:33 p.m. UTC | #6
On 06/17/2011 02:37 AM, Arend van Spriel wrote:
> On 06/17/2011 12:08 AM, Pavel Roskin wrote:

>> In my case, I was able to restart the system by sysrq when using BUG_ON,
>> but the system would hang hard with WARN_ON. I think continuing after
>> returning an invalid index is wrong. It will lead to memory corruption
>> that could in turn lead to corruption of the filesystem.
>
> I think it is generally a bad idea to use BUG_ON as a solution. As
> Stanislaw indicated there are platforms which can continue without a
> hang so you are regressing those. Regarding your patch I think the
> -EINVAL is good to have, but leave the WARN_ON.

Well, then it would be pointless.  The whole idea was to prevent memory 
corruption and to stop early.  Returning -EINVAL for real (as opposed to 
having it in the dead code) could lead to worse memory corruption than 
returning 0.

The real fix would be to make rate control algorithms robust to the case 
when no rates are allowed.  My patch was meant as a first little step 
towards it by making it easier to capture the backtrace and restart the 
system.  But it's not worth the trouble for me to argue about it.
Johannes Berg June 19, 2011, 8:05 a.m. UTC | #7
On Wed, 2011-06-15 at 18:02 -0400, Pavel Roskin wrote:
> WARN_ON is not enough, as we cannot return a valid index, and the
> callers will use whatever we return, causing a cascade of oopses and
> eventually a panic.

Can you show the details of that? 0 should always be an at least
semi-valid index pointing to _something_ whereas -EINVAL will cause it
to point to random memory that isn't even valid...
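
The difference between the two return values can be made concrete with a small user-space model. This is a sketch under stated assumptions: struct demo_rate is a hypothetical entry type (the real struct ieee80211_rate differs), and all function names are made up for illustration. Offsets are computed arithmetically so that nothing outside the table is ever dereferenced.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical rate entry; the real struct ieee80211_rate differs. */
struct demo_rate {
	unsigned short bitrate;
	unsigned short hw_value;
	unsigned int flags;
};

/* Byte offset, relative to the start of the table, that idx would address. */
static ptrdiff_t demo_index_offset(int idx)
{
	return (ptrdiff_t)idx * (ptrdiff_t)sizeof(struct demo_rate);
}

/* Non-zero if idx addresses an entry inside a table of n entries. */
static int demo_index_in_table(int idx, int n)
{
	return idx >= 0 && idx < n;
}

/* Checked accessor: traps on a bad index instead of reading stray memory. */
static const struct demo_rate *demo_rate_at(const struct demo_rate *table,
					    int n, int idx)
{
	assert(demo_index_in_table(idx, n));
	return &table[idx];
}
```

Index 0 gives offset 0, the first entry of any non-empty table. -EINVAL is -22 on Linux, so the same arithmetic yields an offset 22 entries before the start of the table, which is outside it regardless of the table size.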

johannes



Patch

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index e33fe79..d117019 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -3108,10 +3108,10 @@  rate_lowest_index(struct ieee80211_supported_band *sband,
 		if (rate_supported(sta, sband->band, i))
 			return i;
 
-	/* warn when we cannot find a rate. */
-	WARN_ON(1);
+	/* If we cannot find any rate, we are in trouble. */
+	BUG_ON(1);
 
-	return 0;
+	return -EINVAL;
 }
 
 static inline