diff mbox

[PATCHv3,2/9] ir-rx51: Handle signals properly

Message ID 1346349271-28073-3-git-send-email-timo.t.kokkonen@iki.fi (mailing list archive)
State New, archived
Headers show

Commit Message

Timo Kokkonen Aug. 30, 2012, 5:54 p.m. UTC
The lirc-dev expects the ir-code to be transmitted when the write call
returns back to the user space. We should not leave TX ongoing no
matter what is the reason we return to the user space. Easiest
solution for that is to simply remove interruptible sleeps.

The first wait_event_interruptible is thus replaced with return -EBUSY
in case there is still ongoing transfer. This should suffice as the
concept of sending multiple codes in parallel does not make sense.

The second wait_event_interruptible call is replaced with
wait_even_timeout with a fixed and safe timeout that should prevent
the process from getting stuck in kernel for too long.

Also, from now on we will force the TX to stop before we return from
write call. If the TX happened to time out for some reason, we should
not leave the HW transmitting anything.

Signed-off-by: Timo Kokkonen <timo.t.kokkonen@iki.fi>
---
 drivers/media/rc/ir-rx51.c | 39 ++++++++++++++++++++++++++++-----------
 1 file changed, 28 insertions(+), 11 deletions(-)

Comments

Sakari Ailus Sept. 1, 2012, 5:14 p.m. UTC | #1
Moi,

On Thu, Aug 30, 2012 at 08:54:24PM +0300, Timo Kokkonen wrote:
> @@ -273,9 +281,18 @@ static ssize_t lirc_rx51_write(struct file *file, const char *buf,
>  
>  	/*
>  	 * Don't return back to the userspace until the transfer has
> -	 * finished
> +	 * finished. However, we wish to not spend any more than 500ms
> +	 * in kernel. No IR code TX should ever take that long.
> +	 */
> +	i = wait_event_timeout(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0,
> +			HZ / 2);

Why such an arbitrary timeout? In reality it might not bite the user space
in practice ever, but is it (and if so, why) really required in the first
place?

Cheers,
Timo Kokkonen Sept. 2, 2012, 2:54 p.m. UTC | #2
Terve,

On 09/01/12 20:14, Sakari Ailus wrote:
> Moi,
> 
> On Thu, Aug 30, 2012 at 08:54:24PM +0300, Timo Kokkonen wrote:
>> @@ -273,9 +281,18 @@ static ssize_t lirc_rx51_write(struct file *file, const char *buf,
>>  
>>  	/*
>>  	 * Don't return back to the userspace until the transfer has
>> -	 * finished
>> +	 * finished. However, we wish to not spend any more than 500ms
>> +	 * in kernel. No IR code TX should ever take that long.
>> +	 */
>> +	i = wait_event_timeout(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0,
>> +			HZ / 2);
> 
> Why such an arbitrary timeout? In reality it might not bite the user space
> in practice ever, but is it (and if so, why) really required in the first
> place?

Well, I can think of two cases:

1) Something goes wrong. Such before I converted the patch to use the up
to date PM QoS implementation, the transmitting could take very long
time because the interrupts were not waking up the MPU. Now that this is
sorted out only unknown bugs can cause transmitting to hang indefinitely.

2) User is (intentionally?) doing something wrong. For example by
feeding in an IR code that has got very long pulses, he could end up
having the lircd process hung in kernel unkillable for long time. That
could be avoided quite easily by counting the pulse lengths and
rejecting any IR codes that are obviously too long. But since I'd like
to also protect against 1) case, I think this solution works just fine.

In the end, this is just safety measure that this driver behaves well.

-Timo
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sakari Ailus Sept. 2, 2012, 3:06 p.m. UTC | #3
Heippa,

Timo Kokkonen wrote:
> Terve,
>
> On 09/01/12 20:14, Sakari Ailus wrote:
>> Moi,
>>
>> On Thu, Aug 30, 2012 at 08:54:24PM +0300, Timo Kokkonen wrote:
>>> @@ -273,9 +281,18 @@ static ssize_t lirc_rx51_write(struct file *file, const char *buf,
>>>
>>>   	/*
>>>   	 * Don't return back to the userspace until the transfer has
>>> -	 * finished
>>> +	 * finished. However, we wish to not spend any more than 500ms
>>> +	 * in kernel. No IR code TX should ever take that long.
>>> +	 */
>>> +	i = wait_event_timeout(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0,
>>> +			HZ / 2);
>>
>> Why such an arbitrary timeout? In reality it might not bite the user space
>> in practice ever, but is it (and if so, why) really required in the first
>> place?
>
> Well, I can think of two cases:
>
> 1) Something goes wrong. Such before I converted the patch to use the up
> to date PM QoS implementation, the transmitting could take very long
> time because the interrupts were not waking up the MPU. Now that this is
> sorted out only unknown bugs can cause transmitting to hang indefinitely.
>
> 2) User is (intentionally?) doing something wrong. For example by
> feeding in an IR code that has got very long pulses, he could end up
> having the lircd process hung in kernel unkillable for long time. That
> could be avoided quite easily by counting the pulse lengths and
> rejecting any IR codes that are obviously too long. But since I'd like
> to also protect against 1) case, I think this solution works just fine.
>
> In the end, this is just safety measure that this driver behaves well.

In that case I think you should use wait_event_interruptible() instead. 
It's not the driver's job to decide what the user can do with the 
hardware and what not, is it?

Terveisin,
Timo Kokkonen Sept. 2, 2012, 3:20 p.m. UTC | #4
On 09.02 2012 18:06:34, Sakari Ailus wrote:
> Heippa,
> 
> Timo Kokkonen wrote:
> > Terve,
> >
> > On 09/01/12 20:14, Sakari Ailus wrote:
> >> Moi,
> >>
> >> On Thu, Aug 30, 2012 at 08:54:24PM +0300, Timo Kokkonen wrote:
> >>> @@ -273,9 +281,18 @@ static ssize_t lirc_rx51_write(struct file *file, const char *buf,
> >>>
> >>>   	/*
> >>>   	 * Don't return back to the userspace until the transfer has
> >>> -	 * finished
> >>> +	 * finished. However, we wish to not spend any more than 500ms
> >>> +	 * in kernel. No IR code TX should ever take that long.
> >>> +	 */
> >>> +	i = wait_event_timeout(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0,
> >>> +			HZ / 2);
> >>
> >> Why such an arbitrary timeout? In reality it might not bite the user space
> >> in practice ever, but is it (and if so, why) really required in the first
> >> place?
> >
> > Well, I can think of two cases:
> >
> > 1) Something goes wrong. Such before I converted the patch to use the up
> > to date PM QoS implementation, the transmitting could take very long
> > time because the interrupts were not waking up the MPU. Now that this is
> > sorted out only unknown bugs can cause transmitting to hang indefinitely.
> >
> > 2) User is (intentionally?) doing something wrong. For example by
> > feeding in an IR code that has got very long pulses, he could end up
> > having the lircd process hung in kernel unkillable for long time. That
> > could be avoided quite easily by counting the pulse lengths and
> > rejecting any IR codes that are obviously too long. But since I'd like
> > to also protect against 1) case, I think this solution works just fine.
> >
> > In the end, this is just safety measure that this driver behaves well.
> 
> In that case I think you should use wait_event_interruptible() instead. 

Well, that's what I had there in the first place. With interruptible
wait we are left with problem with signals. I was told by Sean Young
that the lirc API expects the write call to finish only after the IR
code is transmitted.

> It's not the driver's job to decide what the user can do with the 
> hardware and what not, is it?

Yeah, policy should be decided by the user space. However, kernel
should not leave any objvious denial of service holes open
either. Allowing a process to get stuck unkillable within kernel for
long time sounds like one to me.

Anyway, we are trying to cover some rare corner cases here, I'm not
sure how it should work exactly..

-Timo

> 
> Terveisin,
> 
> -- 
> Sakari Ailus
> sakari.ailus@iki.fi
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sakari Ailus Sept. 2, 2012, 7:41 p.m. UTC | #5
On Sun, Sep 02, 2012 at 06:20:27PM +0300, Timo Kokkonen wrote:
> On 09.02 2012 18:06:34, Sakari Ailus wrote:
> > Heippa,
> > 
> > Timo Kokkonen wrote:
> > > Terve,
> > >
> > > On 09/01/12 20:14, Sakari Ailus wrote:
> > >> Moi,
> > >>
> > >> On Thu, Aug 30, 2012 at 08:54:24PM +0300, Timo Kokkonen wrote:
> > >>> @@ -273,9 +281,18 @@ static ssize_t lirc_rx51_write(struct file *file, const char *buf,
> > >>>
> > >>>   	/*
> > >>>   	 * Don't return back to the userspace until the transfer has
> > >>> -	 * finished
> > >>> +	 * finished. However, we wish to not spend any more than 500ms
> > >>> +	 * in kernel. No IR code TX should ever take that long.
> > >>> +	 */
> > >>> +	i = wait_event_timeout(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0,
> > >>> +			HZ / 2);
> > >>
> > >> Why such an arbitrary timeout? In reality it might not bite the user space
> > >> in practice ever, but is it (and if so, why) really required in the first
> > >> place?
> > >
> > > Well, I can think of two cases:
> > >
> > > 1) Something goes wrong. Such before I converted the patch to use the up
> > > to date PM QoS implementation, the transmitting could take very long
> > > time because the interrupts were not waking up the MPU. Now that this is
> > > sorted out only unknown bugs can cause transmitting to hang indefinitely.
> > >
> > > 2) User is (intentionally?) doing something wrong. For example by
> > > feeding in an IR code that has got very long pulses, he could end up
> > > having the lircd process hung in kernel unkillable for long time. That
> > > could be avoided quite easily by counting the pulse lengths and
> > > rejecting any IR codes that are obviously too long. But since I'd like
> > > to also protect against 1) case, I think this solution works just fine.
> > >
> > > In the end, this is just safety measure that this driver behaves well.
> > 
> > In that case I think you should use wait_event_interruptible() instead. 
> 
> Well, that's what I had there in the first place. With interruptible
> wait we are left with problem with signals. I was told by Sean Young
> that the lirc API expects the write call to finish only after the IR
> code is transmitted.
> 
> > It's not the driver's job to decide what the user can do with the 
> > hardware and what not, is it?
> 
> Yeah, policy should be decided by the user space. However, kernel
> should not leave any objvious denial of service holes open
> either. Allowing a process to get stuck unkillable within kernel for
> long time sounds like one to me.

It's interruptible, so the user space can interrupt that wait if it chooses
so. Besides, if you call this denial of service, then capturing video on
V4L2 is, too, since others can't use the device in the meantime. :-)

> Anyway, we are trying to cover some rare corner cases here, I'm not
> sure how it should work exactly..

If there was a generic maximum timeout for sending a code, wouldn't it make
sense to enforce that in the LIRC framework instead?

Terveisin,
Timo Kokkonen Sept. 2, 2012, 8:08 p.m. UTC | #6
On 09/02/12 22:41, Sakari Ailus wrote:
> On Sun, Sep 02, 2012 at 06:20:27PM +0300, Timo Kokkonen wrote:
>> On 09.02 2012 18:06:34, Sakari Ailus wrote:
>>> Heippa,
>>>
>>> Timo Kokkonen wrote:
>>>> Terve,
>>>>
>>>> On 09/01/12 20:14, Sakari Ailus wrote:
>>>>> Moi,
>>>>>
>>>>> On Thu, Aug 30, 2012 at 08:54:24PM +0300, Timo Kokkonen wrote:
>>>>>> @@ -273,9 +281,18 @@ static ssize_t lirc_rx51_write(struct file *file, const char *buf,
>>>>>>
>>>>>>   	/*
>>>>>>   	 * Don't return back to the userspace until the transfer has
>>>>>> -	 * finished
>>>>>> +	 * finished. However, we wish to not spend any more than 500ms
>>>>>> +	 * in kernel. No IR code TX should ever take that long.
>>>>>> +	 */
>>>>>> +	i = wait_event_timeout(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0,
>>>>>> +			HZ / 2);
>>>>>
>>>>> Why such an arbitrary timeout? In reality it might not bite the user space
>>>>> in practice ever, but is it (and if so, why) really required in the first
>>>>> place?
>>>>
>>>> Well, I can think of two cases:
>>>>
>>>> 1) Something goes wrong. Such before I converted the patch to use the up
>>>> to date PM QoS implementation, the transmitting could take very long
>>>> time because the interrupts were not waking up the MPU. Now that this is
>>>> sorted out only unknown bugs can cause transmitting to hang indefinitely.
>>>>
>>>> 2) User is (intentionally?) doing something wrong. For example by
>>>> feeding in an IR code that has got very long pulses, he could end up
>>>> having the lircd process hung in kernel unkillable for long time. That
>>>> could be avoided quite easily by counting the pulse lengths and
>>>> rejecting any IR codes that are obviously too long. But since I'd like
>>>> to also protect against 1) case, I think this solution works just fine.
>>>>
>>>> In the end, this is just safety measure that this driver behaves well.
>>>
>>> In that case I think you should use wait_event_interruptible() instead. 
>>
>> Well, that's what I had there in the first place. With interruptible
>> wait we are left with problem with signals. I was told by Sean Young
>> that the lirc API expects the write call to finish only after the IR
>> code is transmitted.
>>
>>> It's not the driver's job to decide what the user can do with the 
>>> hardware and what not, is it?
>>
>> Yeah, policy should be decided by the user space. However, kernel
>> should not leave any objvious denial of service holes open
>> either. Allowing a process to get stuck unkillable within kernel for
>> long time sounds like one to me.
> 
> It's interruptible, so the user space can interrupt that wait if it chooses
> so. Besides, if you call this denial of service, then capturing video on
> V4L2 is, too, since others can't use the device in the meantime. :-)
> 

Well, of course there is no problem if we use interruptible waits. But I
was told by Sean that the lirc API expects the IR TX to be finished
always when the write call returns. I guess the assumption is to avoid
breaking the transmission in the middle in case the process is signaled.
And that's why we shouldn't use interruptible waits.

However, if we allow simply breaking the transmitting in case the
process is signaled any way during the transmission, then the handling
would be trivial in the driver. That is, if someone for example kills or
stops the lirc daemon process, then the IR code just wouldn't finish ever.

Sean, do you have an opinion how this should or is allowed to work?

>> Anyway, we are trying to cover some rare corner cases here, I'm not
>> sure how it should work exactly..
> 
> If there was a generic maximum timeout for sending a code, wouldn't it make
> sense to enforce that in the LIRC framework instead?
> 

Yes, I agree it makes sense to leave unrestricted. But in that case we
definitely have to use interruptible waits in case user space is doing
something stupid and regrets it later :)

> Terveisin,
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sean Young Sept. 3, 2012, 12:36 p.m. UTC | #7
On Sun, Sep 02, 2012 at 11:08:20PM +0300, Timo Kokkonen wrote:
> On 09/02/12 22:41, Sakari Ailus wrote:
> > On Sun, Sep 02, 2012 at 06:20:27PM +0300, Timo Kokkonen wrote:
> >> On 09.02 2012 18:06:34, Sakari Ailus wrote:
> >>> Heippa,
> >>>
> >>> Timo Kokkonen wrote:
> >>>> Terve,
> >>>>
> >>>> On 09/01/12 20:14, Sakari Ailus wrote:
> >>>>> Moi,
> >>>>>
> >>>>> On Thu, Aug 30, 2012 at 08:54:24PM +0300, Timo Kokkonen wrote:
> >>>>>> @@ -273,9 +281,18 @@ static ssize_t lirc_rx51_write(struct file *file, const char *buf,
> >>>>>>
> >>>>>>   	/*
> >>>>>>   	 * Don't return back to the userspace until the transfer has
> >>>>>> -	 * finished
> >>>>>> +	 * finished. However, we wish to not spend any more than 500ms
> >>>>>> +	 * in kernel. No IR code TX should ever take that long.
> >>>>>> +	 */
> >>>>>> +	i = wait_event_timeout(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0,
> >>>>>> +			HZ / 2);
> >>>>>
> >>>>> Why such an arbitrary timeout? In reality it might not bite the user space
> >>>>> in practice ever, but is it (and if so, why) really required in the first
> >>>>> place?
> >>>>
> >>>> Well, I can think of two cases:
> >>>>
> >>>> 1) Something goes wrong. Such before I converted the patch to use the up
> >>>> to date PM QoS implementation, the transmitting could take very long
> >>>> time because the interrupts were not waking up the MPU. Now that this is
> >>>> sorted out only unknown bugs can cause transmitting to hang indefinitely.
> >>>>
> >>>> 2) User is (intentionally?) doing something wrong. For example by
> >>>> feeding in an IR code that has got very long pulses, he could end up
> >>>> having the lircd process hung in kernel unkillable for long time. That
> >>>> could be avoided quite easily by counting the pulse lengths and
> >>>> rejecting any IR codes that are obviously too long. But since I'd like
> >>>> to also protect against 1) case, I think this solution works just fine.
> >>>>
> >>>> In the end, this is just safety measure that this driver behaves well.
> >>>
> >>> In that case I think you should use wait_event_interruptible() instead. 
> >>
> >> Well, that's what I had there in the first place. With interruptible
> >> wait we are left with problem with signals. I was told by Sean Young
> >> that the lirc API expects the write call to finish only after the IR
> >> code is transmitted.
> >>
> >>> It's not the driver's job to decide what the user can do with the 
> >>> hardware and what not, is it?
> >>
> >> Yeah, policy should be decided by the user space. However, kernel
> >> should not leave any objvious denial of service holes open
> >> either. Allowing a process to get stuck unkillable within kernel for
> >> long time sounds like one to me.

It's not elegant, but this can't be used as a denial of service attack.
The driver waits for a maximum of a half a second after which signals
are serviced as normal.

> > It's interruptible, so the user space can interrupt that wait if it chooses
> > so. Besides, if you call this denial of service, then capturing video on
> > V4L2 is, too, since others can't use the device in the meantime. :-)
> > 
> 
> Well, of course there is no problem if we use interruptible waits. But I
> was told by Sean that the lirc API expects the IR TX to be finished
> always when the write call returns.

This is part of the ABI. The lircd deamon might want to do gap calculation
if there are large spaces in the IR code being sent. Maybe others can
enlighten us why such an ABI was choosen.

> I guess the assumption is to avoid
> breaking the transmission in the middle in case the process is signaled.
> And that's why we shouldn't use interruptible waits.
>
> However, if we allow simply breaking the transmitting in case the
> process is signaled any way during the transmission, then the handling
> would be trivial in the driver. That is, if someone for example kills or
> stops the lirc daemon process, then the IR code just wouldn't finish ever.
> 
> Sean, do you have an opinion how this should or is allowed to work?

You want to know when the hardware is done sending the IR. If you return
EINTR to user space, how would user space know how much IR has been sent, 
if any?

This ABI is not particularily elegant so there are proposals for a better
interface which would obsolete the lirc interface. David Hardeman has
worked on this:

http://patchwork.linuxtv.org/patch/11411/

> >> Anyway, we are trying to cover some rare corner cases here, I'm not
> >> sure how it should work exactly..
> > 
> > If there was a generic maximum timeout for sending a code, wouldn't it make
> > sense to enforce that in the LIRC framework instead?
> > 
> 
> Yes, I agree it makes sense to leave unrestricted. But in that case we
> definitely have to use interruptible waits in case user space is doing
> something stupid and regrets it later :)

Only for 500ms. 


Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Härdeman Sept. 3, 2012, 9:41 p.m. UTC | #8
Hej,

On Mon, Sep 03, 2012 at 01:36:53PM +0100, Sean Young wrote:
>On Sun, Sep 02, 2012 at 11:08:20PM +0300, Timo Kokkonen wrote:
>> I guess the assumption is to avoid
>> breaking the transmission in the middle in case the process is signaled.
>> And that's why we shouldn't use interruptible waits.
>>
>> However, if we allow simply breaking the transmitting in case the
>> process is signaled any way during the transmission, then the handling
>> would be trivial in the driver. That is, if someone for example kills or
>> stops the lirc daemon process, then the IR code just wouldn't finish ever.
>> 
>> Sean, do you have an opinion how this should or is allowed to work?
>
>You want to know when the hardware is done sending the IR. If you return
>EINTR to user space, how would user space know how much IR has been sent, 
>if any?
>
>This ABI is not particularily elegant so there are proposals for a better
>interface which would obsolete the lirc interface. David Hardeman has
>worked on this:
>
>http://patchwork.linuxtv.org/patch/11411/
>

Yes, the first step is an asynchronous interface using a kfifo which is
managed/fed using functionality in rc-core and drained by the drivers.

The size of the kfifo() itself is the only limiting factor right now,
but I do think we should eventually add some restrictions on the combined
duration of the pulse/space timings that are in the queue at any given
point.

Say, for example, that any given pulse/space value is not allowed to be
above 500ms and the total duration of the queue is not allowed to be
above 1000ms. In case user-space wants (for whatever reason)...to write
a 4000ms space, it would have to do so in 8 messages of 500ms each.

Each message write() provides the opportunity for a interruptible wait
(in the regular case) or returning EAGAIN (in the O_NONBLOCK case) -
assuming that the kfifo already holds pulse/space timing totalling
1000ms and/or is full.

EINTR should only be returned if nothing has been written to the kfifo
at all.

That way we would avoid policy in kernel while still making it possible
to kill a misbehaving user-space process by forcing it to drip feed long
TX sequences.
Sean Young Sept. 4, 2012, 12:43 p.m. UTC | #9
On Mon, Sep 03, 2012 at 11:41:55PM +0200, David Härdeman wrote:
> Hej,
> 
> On Mon, Sep 03, 2012 at 01:36:53PM +0100, Sean Young wrote:
> >On Sun, Sep 02, 2012 at 11:08:20PM +0300, Timo Kokkonen wrote:
> >> I guess the assumption is to avoid
> >> breaking the transmission in the middle in case the process is signaled.
> >> And that's why we shouldn't use interruptible waits.
> >>
> >> However, if we allow simply breaking the transmitting in case the
> >> process is signaled any way during the transmission, then the handling
> >> would be trivial in the driver. That is, if someone for example kills or
> >> stops the lirc daemon process, then the IR code just wouldn't finish ever.
> >> 
> >> Sean, do you have an opinion how this should or is allowed to work?
> >
> >You want to know when the hardware is done sending the IR. If you return
> >EINTR to user space, how would user space know how much IR has been sent, 
> >if any?
> >
> >This ABI is not particularily elegant so there are proposals for a better
> >interface which would obsolete the lirc interface. David Hardeman has
> >worked on this:
> >
> >http://patchwork.linuxtv.org/patch/11411/
> >
> 
> Yes, the first step is an asynchronous interface using a kfifo which is
> managed/fed using functionality in rc-core and drained by the drivers.
> 
> The size of the kfifo() itself is the only limiting factor right now,
> but I do think we should eventually add some restrictions on the combined
> duration of the pulse/space timings that are in the queue at any given
> point.

While we're at it, it would be useful to check that there no 0s in 
the timings, I'm not sure how well the drivers deal with those.

> Say, for example, that any given pulse/space value is not allowed to be
> above 500ms and the total duration of the queue is not allowed to be
> above 1000ms. In case user-space wants (for whatever reason)...to write
> a 4000ms space, it would have to do so in 8 messages of 500ms each.
> 
> Each message write() provides the opportunity for a interruptible wait
> (in the regular case) or returning EAGAIN (in the O_NONBLOCK case) -
> assuming that the kfifo already holds pulse/space timing totalling
> 1000ms and/or is full.
> 
> EINTR should only be returned if nothing has been written to the kfifo
> at all.

This interface is much better but it's also an ABI change. How should this
be handled? Should rc-core expose it's own /dev/rc[0-9] device with its
own ioctls?

> That way we would avoid policy in kernel while still making it possible
> to kill a misbehaving user-space process by forcing it to drip feed long
> TX sequences.

Is the purpose of this to prevent one user space program from writing
minutes worth of IR and then killing the process won't help since it's
already in the kfifo?

In that case I'd say that close() should purge the kfifo and user space
needs to do fsync to ensure that all IR has been sent.


Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Timo Kokkonen Sept. 14, 2012, 7:58 a.m. UTC | #10
On 09/03/12 15:36, Sean Young wrote:
> On Sun, Sep 02, 2012 at 11:08:20PM +0300, Timo Kokkonen wrote:
>> On 09/02/12 22:41, Sakari Ailus wrote:
>>> On Sun, Sep 02, 2012 at 06:20:27PM +0300, Timo Kokkonen wrote:
>>>> On 09.02 2012 18:06:34, Sakari Ailus wrote:
>>>>> Heippa,
>>>>>
>>>>> Timo Kokkonen wrote:
>>>>>> Terve,
>>>>>>
>>>>>> On 09/01/12 20:14, Sakari Ailus wrote:
>>>>>>> Moi,
>>>>>>>
>>>>>>> On Thu, Aug 30, 2012 at 08:54:24PM +0300, Timo Kokkonen wrote:
>>>>>>>> @@ -273,9 +281,18 @@ static ssize_t lirc_rx51_write(struct file *file, const char *buf,
>>>>>>>>
>>>>>>>>   	/*
>>>>>>>>   	 * Don't return back to the userspace until the transfer has
>>>>>>>> -	 * finished
>>>>>>>> +	 * finished. However, we wish to not spend any more than 500ms
>>>>>>>> +	 * in kernel. No IR code TX should ever take that long.
>>>>>>>> +	 */
>>>>>>>> +	i = wait_event_timeout(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0,
>>>>>>>> +			HZ / 2);
>>>>>>>
>>>>>>> Why such an arbitrary timeout? In reality it might not bite the user space
>>>>>>> in practice ever, but is it (and if so, why) really required in the first
>>>>>>> place?
>>>>>>
>>>>>> Well, I can think of two cases:
>>>>>>
>>>>>> 1) Something goes wrong. Such before I converted the patch to use the up
>>>>>> to date PM QoS implementation, the transmitting could take very long
>>>>>> time because the interrupts were not waking up the MPU. Now that this is
>>>>>> sorted out only unknown bugs can cause transmitting to hang indefinitely.
>>>>>>
>>>>>> 2) User is (intentionally?) doing something wrong. For example by
>>>>>> feeding in an IR code that has got very long pulses, he could end up
>>>>>> having the lircd process hung in kernel unkillable for long time. That
>>>>>> could be avoided quite easily by counting the pulse lengths and
>>>>>> rejecting any IR codes that are obviously too long. But since I'd like
>>>>>> to also protect against 1) case, I think this solution works just fine.
>>>>>>
>>>>>> In the end, this is just safety measure that this driver behaves well.
>>>>>
>>>>> In that case I think you should use wait_event_interruptible() instead. 
>>>>
>>>> Well, that's what I had there in the first place. With interruptible
>>>> wait we are left with problem with signals. I was told by Sean Young
>>>> that the lirc API expects the write call to finish only after the IR
>>>> code is transmitted.
>>>>
>>>>> It's not the driver's job to decide what the user can do with the 
>>>>> hardware and what not, is it?
>>>>
>>>> Yeah, policy should be decided by the user space. However, kernel
>>>> should not leave any objvious denial of service holes open
>>>> either. Allowing a process to get stuck unkillable within kernel for
>>>> long time sounds like one to me.
> 
> It's not elegant, but this can't be used as a denial of service attack.
> The driver waits for a maximum of a half a second after which signals
> are serviced as normal.
> 
>>> It's interruptible, so the user space can interrupt that wait if it chooses
>>> so. Besides, if you call this denial of service, then capturing video on
>>> V4L2 is, too, since others can't use the device in the meantime. :-)
>>>
>>
>> Well, of course there is no problem if we use interruptible waits. But I
>> was told by Sean that the lirc API expects the IR TX to be finished
>> always when the write call returns.
> 
> This is part of the ABI. The lircd deamon might want to do gap calculation
> if there are large spaces in the IR code being sent. Maybe others can
> enlighten us why such an ABI was choosen.
> 
>> I guess the assumption is to avoid
>> breaking the transmission in the middle in case the process is signaled.
>> And that's why we shouldn't use interruptible waits.
>>
>> However, if we allow simply breaking the transmitting in case the
>> process is signaled any way during the transmission, then the handling
>> would be trivial in the driver. That is, if someone for example kills or
>> stops the lirc daemon process, then the IR code just wouldn't finish ever.
>>
>> Sean, do you have an opinion how this should or is allowed to work?
> 
> You want to know when the hardware is done sending the IR. If you return
> EINTR to user space, how would user space know how much IR has been sent, 
> if any?
> 
> This ABI is not particularily elegant so there are proposals for a better
> interface which would obsolete the lirc interface. David Hardeman has
> worked on this:
> 
> http://patchwork.linuxtv.org/patch/11411/
> 

It appears that all "modern" lirc drivers are now using the rc-core
functionalities to implement the common stuff. When the rx51 lirc driver
was first written, the core was not in place yet. Therefore it is
implementing the file operations in the driver, which other rc drivers
won't do today.

So, I think it would make sense to modify the rx51 driver to use the rc
core functionality. But if there is an ABI change ongoing, I could wait
until you have that done before I start working on the change?

Considering this patch set, I think it makes sense still to apply these
as they improve the existing code base. I'll just squash the one patch
to the misc fixes, as pointed by Sakari, and then re-send the set.

-Timo

>>>> Anyway, we are trying to cover some rare corner cases here, I'm not
>>>> sure how it should work exactly..
>>>
>>> If there was a generic maximum timeout for sending a code, wouldn't it make
>>> sense to enforce that in the LIRC framework instead?
>>>
>>
>> Yes, I agree it makes sense to leave unrestricted. But in that case we
>> definitely have to use interruptible waits in case user space is doing
>> something stupid and regrets it later :)
> 
> Only for 500ms. 
> 
> 
> Sean
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sean Young Sept. 16, 2012, 7:42 p.m. UTC | #11
On Fri, Sep 14, 2012 at 10:58:53AM +0300, Timo Kokkonen wrote:
> It appears that all "modern" lirc drivers are now using the rc-core
> functionalities to implement the common stuff. When the rx51 lirc driver
> was first written, the core was not in place yet. Therefore it is
> implementing the file operations in the driver, which other rc drivers
> won't do today.
> 
> So, I think it would make sense to modify the rx51 driver to use the rc
> core functionality. But if there is an ABI change ongoing, I could wait
> until you have that done before I start working on the change?

There is no immediate need for porting to rc-core, AFAIK. OTOH I suspect 
that only some of the drivers using rc-core will only need to have their 
tx_ir method modified for a new sending/receiving ABI, so it shouldn't
stop you. If anything it might make the driver smaller.

At the moment I'm only just put initial patches together so I don't know 
when I or anyone else will have this finished. 


Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Härdeman Sept. 20, 2012, 9:43 a.m. UTC | #12
On Tue, 4 Sep 2012 13:43:56 +0100, Sean Young <sean@mess.org> wrote:
> This interface is much better but it's also an ABI change. How should
this
> be handled? Should rc-core expose it's own /dev/rc[0-9] device with its
> own ioctls?

That was the plan yes. I've posted a patchbomb in the past to the
linux-media mailing list which implements a rc specific chardev with an
ioctl/read/write based API.

Since the entire patchset is a bit much to digest and merging of patches
has been slow lately, I'm trying to drip-feed the patches. The lirc TX
rework was part of that process. It basically lays the groundwork for later
patches.

//David

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/media/rc/ir-rx51.c b/drivers/media/rc/ir-rx51.c
index 9487dd3..e2db94e 100644
--- a/drivers/media/rc/ir-rx51.c
+++ b/drivers/media/rc/ir-rx51.c
@@ -74,6 +74,19 @@  static void lirc_rx51_off(struct lirc_rx51 *lirc_rx51)
 			      OMAP_TIMER_TRIGGER_NONE);
 }
 
+static void lirc_rx51_stop_tx(struct lirc_rx51 *lirc_rx51)
+{
+	if (lirc_rx51->wbuf_index < 0)
+		return;
+
+	lirc_rx51_off(lirc_rx51);
+	lirc_rx51->wbuf_index = -1;
+	omap_dm_timer_stop(lirc_rx51->pwm_timer);
+	omap_dm_timer_stop(lirc_rx51->pulse_timer);
+	omap_dm_timer_set_int_enable(lirc_rx51->pulse_timer, 0);
+	wake_up(&lirc_rx51->wqueue);
+}
+
 static int init_timing_params(struct lirc_rx51 *lirc_rx51)
 {
 	u32 load, match;
@@ -160,13 +173,7 @@  static irqreturn_t lirc_rx51_interrupt_handler(int irq, void *ptr)
 
 	return IRQ_HANDLED;
 end:
-	/* Stop TX here */
-	lirc_rx51_off(lirc_rx51);
-	lirc_rx51->wbuf_index = -1;
-	omap_dm_timer_stop(lirc_rx51->pwm_timer);
-	omap_dm_timer_stop(lirc_rx51->pulse_timer);
-	omap_dm_timer_set_int_enable(lirc_rx51->pulse_timer, 0);
-	wake_up_interruptible(&lirc_rx51->wqueue);
+	lirc_rx51_stop_tx(lirc_rx51);
 
 	return IRQ_HANDLED;
 }
@@ -246,8 +253,9 @@  static ssize_t lirc_rx51_write(struct file *file, const char *buf,
 	if ((count > WBUF_LEN) || (count % 2 == 0))
 		return -EINVAL;
 
-	/* Wait any pending transfers to finish */
-	wait_event_interruptible(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0);
+	/* We can have only one transmit at a time */
+	if (lirc_rx51->wbuf_index >= 0)
+		return -EBUSY;
 
 	if (copy_from_user(lirc_rx51->wbuf, buf, n))
 		return -EFAULT;
@@ -273,9 +281,18 @@  static ssize_t lirc_rx51_write(struct file *file, const char *buf,
 
 	/*
 	 * Don't return back to the userspace until the transfer has
-	 * finished
+	 * finished. However, we wish to not spend any more than 500ms
+	 * in kernel. No IR code TX should ever take that long.
+	 */
+	i = wait_event_timeout(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0,
+			HZ / 2);
+
+	/*
+	 * Ensure transmitting has really stopped, even if the timers
+	 * went mad or something else happened that caused it still
+	 * sending out something.
 	 */
-	wait_event_interruptible(lirc_rx51->wqueue, lirc_rx51->wbuf_index < 0);
+	lirc_rx51_stop_tx(lirc_rx51);
 
 	/* We can sleep again */
 	lirc_rx51->pdata->set_max_mpu_wakeup_lat(lirc_rx51->dev, -1);