diff mbox

Fix effect aborting in ff-memless

Message ID c2278d36-993b-402f-dd07-635980112bbe@m-reimer.de (mailing list archive)
State New, archived
Headers show

Commit Message

Manuel Reimer June 19, 2016, 12:44 p.m. UTC
Hello,

while debugging a problem with hid-sony I got stuck with a problem 
actually caused by ff-memless. In some situations effect aborting is 
delayed, so it may be triggered seconds after all devices have been 
destroyed, which causes the kernel to panic.

The aborting request actually gets received here:
https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L467
This "aborting" flag is then handled here:
https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L376
But before this line is reached, there is a time check to check if the 
effect actually is due to be started:
https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L359

This time check now causes a problem if the effect, which is meant to be 
*aborted* was scheduled to be *started* some time in future and the 
device is destroyed before this time is reached.

My patch fixes this by setting the trigger times to "now" after setting 
the aborting flag. This way the following code deals with aborting the 
effect immediately without setting a timer for it.

There may be other ways to fix this, so I would be happy to get some 
feedback.


Signed-off-by: Manuel Reimer <mail@m-reimer.de>

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Dmitry Torokhov June 20, 2016, 5:33 p.m. UTC | #1
On Sun, Jun 19, 2016 at 02:44:35PM +0200, Manuel Reimer wrote:
> Hello,
> 
> while debugging a problem with hid-sony I got stuck with a problem
> actually caused by ff-memless. In some situations effect aborting is
> delayed, so it may be triggered seconds after all devices have been
> destroyed, which causes the kernel to panic.
> 
> The aborting request actually gets received here:
> https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L467
> This "aborting" flag is then handled here:
> https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L376
> But before this line is reached, there is a time check to check if
> the effect actually is due to be started:
> https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L359
> 
> This time check now causes a problem if the effect, which is meant
> to be *aborted* was scheduled to be *started* some time in future
> and the device is destroyed before this time is reached.

I am not clear how this can happen. If effect hasn't actually started
playing (i.e. we have FF_EFFECT_STARTED bit set, but FF_EFFECT_PLAYING
is not yet set), then when stopping effect we do not need to do anything
except clear FF_EFFECT_STARTED (since we did not touch the hardware
yet).

Now, if FF_EFFECT_PLAYING is set, that means that play_at time is in
the past and we won't be skipping this effect in ml_get_combo_effect().

Could you please post a stack trace of the crash you observed?

> 
> My patch fixes this by setting the trigger times to "now" after
> setting the aborting flag. This way the following code deals with
> aborting the effect immediately without setting a timer for it.
> 
> There may be other ways to fix this, so I would be happy to get some
> feedback.
> 
> 
> Signed-off-by: Manuel Reimer <mail@m-reimer.de>
> 
> --- a/drivers/input/ff-memless.c	2016-05-13 16:06:29.722685021 +0200
> +++ b/drivers/input/ff-memless.c	2016-06-19 14:25:39.790375270 +0200
> @@ -463,9 +463,11 @@ static int ml_ff_playback(struct input_d
>  	} else {
>  		pr_debug("initiated stop\n");
> 
> -		if (test_bit(FF_EFFECT_PLAYING, &state->flags))
> +		if (test_bit(FF_EFFECT_PLAYING, &state->flags)) {
>  			__set_bit(FF_EFFECT_ABORTING, &state->flags);
> -		else
> +			state->play_at = jiffies;
> +			state->adj_at = jiffies;
> +		} else
>  			__clear_bit(FF_EFFECT_STARTED, &state->flags);
>  	}
> 

Thanks.
Manuel Reimer June 25, 2016, 1:28 p.m. UTC | #2
On 06/20/2016 07:33 PM, Dmitry Torokhov wrote:
> I am not clear how this can happen. If effect hasn't actually started
> playing (i.e. we have FF_EFFECT_STARTED bit set, but FF_EFFECT_PLAYING
> is not yet set), then when stopping effect we do not need to do anything
> except clear FF_EFFECT_STARTED (since we did not touch the hardware
> yet).

That's true, but I think my crash works a bit different.

What I'm doing is "hammering" the code with "start playback" events. 
Maybe not common in normal use, but it shouldn't be possible to crash 
the kernel with something like this.


For the crash to happen, the uploaded effect has to have a replay delay 
of some seconds. With this, what is actually happening is:

- The first playback request is accepted in ml_ff_playback
   (nonzero value to ml_ff_playback).
   --> FF_EFFECT_STARTED is set, FF_EFFECT_PLAYING not set
   --> play_at is in the future
- Some time later (play_at reached) the effect actually is started
   --> Now both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING are set
   --> play_at is in the past
- Now a new playback request for the same effect is accepted
   in ml_ff_playback
   --> Still both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING set
   --> play_at is now in the future!!!

=> That's the time where the USB plug is pulled
=> Kernel is trying to stop possible running effects
    Current situation:
    - The effect is playing and hasn't stopped playing so far
    - The same effect is scheduled to be playing again
      (play_at in future)

- Now we get a "stop playback" request (zero value to ml_ff_playback)
   --> FF_EFFECT_PLAYING still set --> FF_EFFECT_ABORTING will be set
   --> play_at still in the future
- The "FF_EFFECT_ABORTING" request now isn't handled directly in
   ml_get_combo_effect, as it is filtered out "play_at in future".
- A timer is set (as play_at is in future)
- Some time later the aborting is triggered with all memory already
   freed in both, ff-memless and hid-sony, modules.



With this knowledge, it now is *very* easy to reproduce this one:
- Open fftest on the device
- press "4" and "enter", immediately pre-enter a "4" after that
- As soon as the controller starts to vibrate, press enter and
   disconnect the controller
- Wait some seconds

Crashes *every time* with the hid-sony module.
Produces ugly output on dmesg with xpad module. Seems like some strings 
are read from freed memory in this case.

My patch fixes this by setting play_at to "now", so the 
"FF_EFFECT_ABORTING" is handled immediately in this case. --> No crash

Manuel
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitry Torokhov June 25, 2016, 3:29 p.m. UTC | #3
On Sat, Jun 25, 2016 at 03:28:05PM +0200, Manuel Reimer wrote:
> On 06/20/2016 07:33 PM, Dmitry Torokhov wrote:
> >I am not clear how this can happen. If effect hasn't actually started
> >playing (i.e. we have FF_EFFECT_STARTED bit set, but FF_EFFECT_PLAYING
> >is not yet set), then when stopping effect we do not need to do anything
> >except clear FF_EFFECT_STARTED (since we did not touch the hardware
> >yet).
> 
> That's true, but I think my crash works a bit different.
> 
> What I'm doing is "hammering" the code with "start playback" events.
> Maybe not common in normal use, but it shouldn't be possible to
> crash the kernel with something like this.
> 
> 
> For the crash to happen, the uploaded effect has to have a replay
> delay of some seconds. With this, what is actually happening is:
> 
> - The first playback request is accepted in ml_ff_playback
>   (nonzero value to ml_ff_playback).
>   --> FF_EFFECT_STARTED is set, FF_EFFECT_PLAYING not set
>   --> play_at is in the future
> - Some time later (play_at reached) the effect actually is started
>   --> Now both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING are set
>   --> play_at is in the past
> - Now a new playback request for the same effect is accepted
>   in ml_ff_playback
>   --> Still both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING set
>   --> play_at is now in the future!!!
> 
> => That's the time where the USB plug is pulled
> => Kernel is trying to stop possible running effects
>    Current situation:
>    - The effect is playing and hasn't stopped playing so far
>    - The same effect is scheduled to be playing again
>      (play_at in future)
> 
> - Now we get a "stop playback" request (zero value to ml_ff_playback)
>   --> FF_EFFECT_PLAYING still set --> FF_EFFECT_ABORTING will be set
>   --> play_at still in the future
> - The "FF_EFFECT_ABORTING" request now isn't handled directly in
>   ml_get_combo_effect, as it is filtered out "play_at in future".
> - A timer is set (as play_at is in future)
> - Some time later the aborting is triggered with all memory already
>   freed in both, ff-memless and hid-sony, modules.
> 
> 
> 
> With this knowledge, it now is *very* easy to reproduce this one:
> - Open fftest on the device
> - press "4" and "enter", immediately pre-enter a "4" after that
> - As soon as the controller starts to vibrate, press enter and
>   disconnect the controller
> - Wait some seconds
> 
> Crashes *every time* with the hid-sony module.
> Produces ugly output on dmesg with xpad module. Seems like some
> strings are read from freed memory in this case.
> 
> My patch fixes this by setting play_at to "now", so the
> "FF_EFFECT_ABORTING" is handled immediately in this case. --> No
> crash

I see. I however wonder what the proper behavior should be when we
basically request to restart the effect. If start delay is 0 then new
effect parameters would immediately be applied, so I guess it would be
reasonable to say that if we apply new start_delay to the effect then
currently playing instance should be stopped...

Anssi, what is your take here?

Thanks.
Anssi Hannula June 27, 2016, 6:31 p.m. UTC | #4
25.06.2016, 18:29, Dmitry Torokhov kirjoitti:
> On Sat, Jun 25, 2016 at 03:28:05PM +0200, Manuel Reimer wrote:
>> On 06/20/2016 07:33 PM, Dmitry Torokhov wrote:
>>> I am not clear how this can happen. If effect hasn't actually started
>>> playing (i.e. we have FF_EFFECT_STARTED bit set, but FF_EFFECT_PLAYING
>>> is not yet set), then when stopping effect we do not need to do anything
>>> except clear FF_EFFECT_STARTED (since we did not touch the hardware
>>> yet).
>>
>> That's true, but I think my crash works a bit different.
>>
>> What I'm doing is "hammering" the code with "start playback" events.
>> Maybe not common in normal use, but it shouldn't be possible to
>> crash the kernel with something like this.
>>
>>
>> For the crash to happen, the uploaded effect has to have a replay
>> delay of some seconds. With this, what is actually happening is:
>>
>> - The first playback request is accepted in ml_ff_playback
>>   (nonzero value to ml_ff_playback).
>>   --> FF_EFFECT_STARTED is set, FF_EFFECT_PLAYING not set
>>   --> play_at is in the future
>> - Some time later (play_at reached) the effect actually is started
>>   --> Now both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING are set
>>   --> play_at is in the past
>> - Now a new playback request for the same effect is accepted
>>   in ml_ff_playback
>>   --> Still both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING set
>>   --> play_at is now in the future!!!
>>
>> => That's the time where the USB plug is pulled
>> => Kernel is trying to stop possible running effects
>>    Current situation:
>>    - The effect is playing and hasn't stopped playing so far
>>    - The same effect is scheduled to be playing again
>>      (play_at in future)
>>
>> - Now we get a "stop playback" request (zero value to ml_ff_playback)
>>   --> FF_EFFECT_PLAYING still set --> FF_EFFECT_ABORTING will be set
>>   --> play_at still in the future
>> - The "FF_EFFECT_ABORTING" request now isn't handled directly in
>>   ml_get_combo_effect, as it is filtered out "play_at in future".
>> - A timer is set (as play_at is in future)
>> - Some time later the aborting is triggered with all memory already
>>   freed in both, ff-memless and hid-sony, modules.
>>
>>
>>
>> With this knowledge, it now is *very* easy to reproduce this one:
>> - Open fftest on the device
>> - press "4" and "enter", immediately pre-enter a "4" after that
>> - As soon as the controller starts to vibrate, press enter and
>>   disconnect the controller
>> - Wait some seconds
>>
>> Crashes *every time* with the hid-sony module.
>> Produces ugly output on dmesg with xpad module. Seems like some
>> strings are read from freed memory in this case.
>>
>> My patch fixes this by setting play_at to "now", so the
>> "FF_EFFECT_ABORTING" is handled immediately in this case. --> No
>> crash
> 
> I see. I however wonder what the proper behavior should be when we
> basically request to restart the effect. If start delay is 0 then new
> effect parameters would immediately be applied, so I guess it would be
> reasonable to say that if we apply new start_delay to the effect then
> currently playing instance should be stopped...
> 
> Anssi, what is your take here?

I agree with you, starting a delayed effect should immediately stop the
effect if it is already running.

So I guess e.g. ml_ff_playback() should set FF_EFFECT_ABORTING in such a
case, and ml_get_combo_effect() should process FF_EFFECT_ABORTING
effects regardless of their ->play_at being in the future (the latter
would fix the original issue too, AFAICS).


I guess an easier-to-follow state bit set would be e.g. the following:

FF_EFFECT_STARTED: always reflects started/stopped state
FF_EFFECT_PLAYING: always reflects hardware state
FF_EFFECT_DIRTY: effect requires immediate refresh (stop or modify)

which would make it clearer where each is set/cleared as there is no
more need for "delayed" handling of STARTED/PLAYING via ABORTING or
clearing FF_EFFECT_PLAYING in various places to force an effect refresh.
diff mbox

Patch

--- a/drivers/input/ff-memless.c	2016-05-13 16:06:29.722685021 +0200
+++ b/drivers/input/ff-memless.c	2016-06-19 14:25:39.790375270 +0200
@@ -463,9 +463,11 @@  static int ml_ff_playback(struct input_d
  	} else {
  		pr_debug("initiated stop\n");

-		if (test_bit(FF_EFFECT_PLAYING, &state->flags))
+		if (test_bit(FF_EFFECT_PLAYING, &state->flags)) {
  			__set_bit(FF_EFFECT_ABORTING, &state->flags);
-		else
+			state->play_at = jiffies;
+			state->adj_at = jiffies;
+		} else
  			__clear_bit(FF_EFFECT_STARTED, &state->flags);
  	}