diff mbox

dvb usb issues since kernel 4.9

Message ID CA+55aFwuAojr7vAfiRO-2je-wDs7pu+avQZNhX_k9NN=D7_zVQ@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Linus Torvalds Jan. 8, 2018, 7:51 p.m. UTC
On Mon, Jan 8, 2018 at 11:15 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
>
> Both dwc2_hsotg and ehci-hcd use the tasklets embedded in the
> giveback_urb_bh member of struct usb_hcd.  See usb_hcd_giveback_urb()
> in drivers/usb/core/hcd.c; the calls are
>
>         else if (high_prio_bh)
>                 tasklet_hi_schedule(&bh->bh);
>         else
>                 tasklet_schedule(&bh->bh);
>
> As it turns out, high_prio_bh gets set for interrupt and isochronous
> URBs but not for bulk and control URBs.  The DVB driver in question
> uses bulk transfers.

Ok, so we could try out something like the appended?

NOTE! I have not tested this at all. It LooksObvious(tm), but...

                    Linus
kernel/softirq.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Comments

Mauro Carvalho Chehab Jan. 9, 2018, 5:42 p.m. UTC | #1
Em Mon, 8 Jan 2018 11:51:04 -0800
Linus Torvalds <torvalds@linux-foundation.org> escreveu:

> On Mon, Jan 8, 2018 at 11:15 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> >
> > Both dwc2_hsotg and ehci-hcd use the tasklets embedded in the
> > giveback_urb_bh member of struct usb_hcd.  See usb_hcd_giveback_urb()
> > in drivers/usb/core/hcd.c; the calls are
> >
> >         else if (high_prio_bh)
> >                 tasklet_hi_schedule(&bh->bh);
> >         else
> >                 tasklet_schedule(&bh->bh);
> >
> > As it turns out, high_prio_bh gets set for interrupt and isochronous
> > URBs but not for bulk and control URBs.  The DVB driver in question
> > uses bulk transfers.  
> 
> Ok, so we could try out something like the appended?
> 
> NOTE! I have not tested this at all. It LooksObvious(tm), but...
> 
>                     Linus



>  kernel/softirq.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 2f5e87f1bae2..97b080956fea 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -79,12 +79,16 @@ static void wakeup_softirqd(void)
>  
>  /*
>   * If ksoftirqd is scheduled, we do not want to process pending softirqs
> - * right now. Let ksoftirqd handle this at its own rate, to get fairness.
> + * right now. Let ksoftirqd handle this at its own rate, to get fairness,
> + * unless we're doing some of the synchronous softirqs.
>   */
> -static bool ksoftirqd_running(void)
> +#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +static bool ksoftirqd_running(unsigned long pending)
>  {
>  	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
>  
> +	if (pending & SOFTIRQ_NOW_MASK)
> +		return false;
>  	return tsk && (tsk->state == TASK_RUNNING);
>  }
>  
> @@ -325,7 +329,7 @@ asmlinkage __visible void do_softirq(void)
>  
>  	pending = local_softirq_pending();
>  
> -	if (pending && !ksoftirqd_running())
> +	if (pending && !ksoftirqd_running(pending))
>  		do_softirq_own_stack();
>  
>  	local_irq_restore(flags);
> @@ -352,7 +356,7 @@ void irq_enter(void)
>  
>  static inline void invoke_softirq(void)
>  {
> -	if (ksoftirqd_running())
> +	if (ksoftirqd_running(local_softirq_pending()))
>  		return;
>  
>  	if (!force_irqthreads) {


Hi Linus,

Patch makes sense to me, although I was not able to test it myself.

I set a RPi3 machine here with vanilla Kernel 4.14.11 running a standard
raspbian distribution (with elevator=deadline). Right now, I'm trying to
reproduce the bug with dvbv5-zap. I may eventually do more tests on 
some other slow machines.

Usually, applications like tvheadend records just one channel. So, instead
of a ~58 Mbits/s payload, it uses, typically, ~11 Mbits/s for a HD channel.
This is usually filtered by hardware. Here, I'm forcing to record the
entire TS, in order to make easier to reproduce the issue. So, I'm forcing
a condition that it is usually worse than real usecases (at last for HD - I
I don't have any DVB stream here with a 4K channel).

From what I checked so far, with vanila upstream Kernel on RPi3, just
receiving a DVB stream - or receiving it and writing to /dev/null works
with or without your patch.

The problem starts to happen when there are concurrency with writes.

On my preliminar tests, writing to a file on an ext4 partition at a
USB stick loses data up to the point to make it useless (1/4 of the data
is lost!). However, writing to a class 10 microSD card is doable.

If you're curious enough, this is what I'm doing (that are the results
while using class 10 microSD card):

$ FILE=/tmp/out.ts; for i in $(seq 1 6); do echo "step $i"; rm $FILE 2>/dev/null; dvbv5-zap -l universal -c ~/vivo-channels.conf NBR -o $FILE -P -t60 2>&1|grep -E "(buffer|received)"; du $FILE 2>/dev/null; done 
step 1
  Setting buffer length to 7250000
buffer overrun
buffer overrun
buffer overrun
buffer overrun
buffer overrun
buffer overrun
buffer overrun
received 347504652 bytes (5656 Kbytes/sec)
339368	/tmp/out.ts
step 2
  Setting buffer length to 7250000
buffer overrun
received 408995880 bytes (6656 Kbytes/sec)
399416	/tmp/out.ts
step 3
  Setting buffer length to 7250000
received 412999716 bytes (6722 Kbytes/sec)
403328	/tmp/out.ts
step 4
  Setting buffer length to 7250000
buffer overrun
received 415564788 bytes (6763 Kbytes/sec)
405832	/tmp/out.ts
step 5
  Setting buffer length to 7250000
received 412999716 bytes (6722 Kbytes/sec)
403324	/tmp/out.ts
step 6
  Setting buffer length to 7250000
received 408366080 bytes (6646 Kbytes/sec)
398796	/tmp/out.ts

My plan is to do more tests along this week, and try to tweak a little
bit both userspace and kernelspace, in order to see if I can get better
results.

Thanks,
Mauro
Linus Torvalds Jan. 9, 2018, 5:55 p.m. UTC | #2
On Tue, Jan 9, 2018 at 9:42 AM, Mauro Carvalho Chehab
<mchehab@s-opensource.com> wrote:
>
> On my preliminar tests, writing to a file on an ext4 partition at a
> USB stick loses data up to the point to make it useless (1/4 of the data
> is lost!). However, writing to a class 10 microSD card is doable.

Note that most USB sticks are horrible crap. They can have write
latencies counted in _seconds_.

You can cause VM issues and various other non-hardware stalls with
them, simply because something gets stuck waiting for a page writeout
that should take a few ms on any reasonable hardware, but ends up
talking half a second or more.

For example, even really well-written software that tries to do things
like threaded write-behind to smooth out the IO will be _totally_
screwed by the USB stick behavior (where you might write a few MB at
high speeds, and then the next write - however small - takes a second
because the stupid USB stick does a synchronous garbage collection.
Suddenly all that clever software that tried to keep things moving
along smoothly without any hiccups, and tried hard to make the USB bus
have a nice constant loadm can't do anything at all about the crap
hardware.

So when testing writes to USB sticks, I'm not convinced you're
actually testing any USB bus limitations or even really any other
hardware limitations than the USB stick itself.

                  Linus
Jesper Dangaard Brouer Jan. 9, 2018, 9:26 p.m. UTC | #3
On Tue, 9 Jan 2018 15:42:35 -0200 Mauro Carvalho Chehab <mchehab@s-opensource.com> wrote:
> Em Mon, 8 Jan 2018 11:51:04 -0800 Linus Torvalds <torvalds@linux-foundation.org> escreveu:
> 
[...]
> Patch makes sense to me, although I was not able to test it myself.
 
The patch also make sense to me.  I've done some basic testing with it
on my high-end Broadwell system (that I use for 100Gbit/s testing). As
expected the network overload case still works, as NET_RX_SOFTIRQ is
not matched. 

> I set a RPi3 machine here with vanilla Kernel 4.14.11 running a
> standard raspbian distribution (with elevator=deadline).

I found a Raspberry Pi Model B+ (I think, BCM2835), that I loaded the
LibreELEC distro on.  One of the guys even created an image for me with
a specific kernel[1] (that I just upgraded the system with).

[1] https://forum.libreelec.tv/thread/4235-dvb-issue-since-le-switched-to-kernel-4-9-x/?postID=77031#post77031
 
> My plan is to do more tests along this week, and try to tweak a little
> bit both userspace and kernelspace, in order to see if I can get
> better results.
 
I've previously experienced that you can be affected by the scheduler
granularity, which is adjustable (with CONFIG_SCHED_DEBUG=y):

 $ grep -H . /proc/sys/kernel/sched_*_granularity_ns
 /proc/sys/kernel/sched_min_granularity_ns:2250000
 /proc/sys/kernel/sched_wakeup_granularity_ns:3000000

The above numbers were confirmed on the RPi2 (see[2]). With commit
4cd13c21b207 ("softirq: Let ksoftirqd do its job"), I expect/assume that
softirq processing latency is bounded by the sched_wakeup_granularity_ns,
which with 3 ms is not good enough for their use-case.

Thus, if you manage to reproduce the case, try to see if adjusting this
can mitigate the issue...


Their system have non-preempt kernel, should they use PREEMPT?

 LibreELEC:~ # uname -a
 Linux LibreELEC 4.14.10 #1 SMP Tue Jan 9 17:35:03 GMT 2018 armv7l GNU/Linux

[2] https://forum.libreelec.tv/thread/4235-dvb-issue-since-le-switched-to-kernel-4-9-x/?postID=76999#post76999
Mike Galbraith Jan. 10, 2018, 3:02 a.m. UTC | #4
On Tue, 2018-01-09 at 22:26 +0100, Jesper Dangaard Brouer wrote:
> 
> I've previously experienced that you can be affected by the scheduler
> granularity, which is adjustable (with CONFIG_SCHED_DEBUG=y):
> 
>  $ grep -H . /proc/sys/kernel/sched_*_granularity_ns
>  /proc/sys/kernel/sched_min_granularity_ns:2250000
>  /proc/sys/kernel/sched_wakeup_granularity_ns:3000000
> 
> The above numbers were confirmed on the RPi2 (see[2]). With commit
> 4cd13c21b207 ("softirq: Let ksoftirqd do its job"), I expect/assume that
> softirq processing latency is bounded by the sched_wakeup_granularity_ns,
> which with 3 ms is not good enough for their use-case.

Note of caution wrt twiddling sched_wakeup_granularity_ns: it must
remain < sched_latency_ns/2 else you effectively disable wakeup
preemption completely, turning CFS into a tick granularity scheduler.

	-Mike
Hanna Hawa July 17, 2018, 11:54 a.m. UTC | #5
Hi,

I'm a software developer working in Marvell SoC team.
I'm facing kernel panic issue while running raid 5 on sata disks 
connected to Macchiatobin (Marvell community board with Armada-8040 SoC 
with 4 ARMv8 cores of CA72)
Raid 5 built with Marvell DMA engine and async_tx mechanism 
(ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2) uses a tasklet to clean 
the done descriptors from the queue.

The panic (see below) occurs while building the RAID-5 (mdadm) or while 
writing/reading to the raid partition.

After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd 
do its job" is problematic patch.

- Using v4.14.0 and problematic patch reverted - no timout issue.
- Using v4.14.0 (including softirq patch) and the additional fix 
proposed by Linus - no timeout issue.

As others have reported in this thread, the softirq change is causing 
some regression.
Would it be possible to either revert the patch or apply a fix such as 
the one proposed by Linus ?

Below panic message:
[   25.371495] mv_xor_v2 f0400000.xor: dma_sync_wait: timeout!
[   25.377101] Kernel panic - not syncing: async_tx_quiesce: DMA error 
waiting for transaction
[   25.377101]
[   25.386973] CPU: 0 PID: 1417 Comm: md0_raid5 Not tainted 4.14.0 #16
[   25.393264] Hardware name: Marvell Armada 8040 DB board (DT)
[   25.398946] Call trace:
[   25.401410] [<ffff000008089310>] dump_backtrace+0x0/0x380
[   25.406831] [<ffff0000080896a4>] show_stack+0x14/0x20
[   25.411904] [<ffff00000890fa78>] dump_stack+0x98/0xb8
[   25.416976] [<ffff0000080c8ef0>] panic+0x118/0x280
[   25.421788] [<ffff000008386a44>] async_tx_quiesce+0x74/0x78
[   25.427382] [<ffff000008386ca4>] async_memcpy+0x1a4/0x2a0
[   25.432806] [<ffff000008747f9c>] async_copy_data.isra.16+0x1b4/0x280
[   25.439186] [<ffff00000874b6fc>] raid_run_ops+0x514/0x1320
[   25.444694] [<ffff000008751550>] handle_stripe+0x1040/0x2848
[   25.450377] [<ffff000008752f98>] 
handle_active_stripes.isra.28+0x240/0x460
[   25.457279] [<ffff000008753468>] raid5d+0x2b0/0x450
[   25.462177] [<ffff00000875ead4>] md_thread+0x104/0x160
[   25.467338] [<ffff0000080e638c>] kthread+0xfc/0x128
[   25.472234] [<ffff000008085354>] ret_from_fork+0x10/0x1c
[   25.477571] Kernel Offset: disabled
[   25.481073] CPU features: 0x002000
[   25.484487] Memory Limit: none
[   25.487556] ---[ end Kernel panic - not syncing: async_tx_quiesce: 
DMA error waiting for transaction
[   25.487556]

Thanks,
Hanna
Linus Torvalds July 17, 2018, 5:09 p.m. UTC | #6
On Tue, Jul 17, 2018 at 4:58 AM Hanna Hawa <hannah@marvell.com> wrote:
>
> After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd
> do its job" is problematic patch.

Ok, this thread died down without any resolution.

>- Using v4.14.0 (including softirq patch) and the additional fix
> proposed by Linus - no timeout issue.

Are you talking about the patch that made HI_SOFTIRQ and
TASKLET_SOFTIRQ special, and had this:

  #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))

in it?

I think I'll just commit the damn thing. It's hacky, but it's simple,
and it never got applied because we had smarter suggestions. But the
smarter suggestions never ended up being applied either, so..

                       Linus
Hanna Hawa July 17, 2018, 6:07 p.m. UTC | #7
Hi Linus,

On 07/17/2018 08:09 PM, Linus Torvalds wrote:
> On Tue, Jul 17, 2018 at 4:58 AM Hanna Hawa <hannah@marvell.com> wrote:
>>
>> After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd
>> do its job" is problematic patch.
>
> Ok, this thread died down without any resolution.
>
>> - Using v4.14.0 (including softirq patch) and the additional fix
>> proposed by Linus - no timeout issue.
>
> Are you talking about the patch that made HI_SOFTIRQ and
> TASKLET_SOFTIRQ special, and had this:
>
>   #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
>
> in it?
yes, exactly..

Link to the patch:
https://git.linuxtv.org/mchehab/experimental.git/commit/?h=v4.15%2bmedia%2bdwc2&id=ccf833fd4a5b99c3d3cf2c09c065670f74a230a7

Thanks,
Hanna

>
> I think I'll just commit the damn thing. It's hacky, but it's simple,
> and it never got applied because we had smarter suggestions. But the
> smarter suggestions never ended up being applied either, so..
>
>                        Linus
>
Mauro Carvalho Chehab July 17, 2018, 10:21 p.m. UTC | #8
Hi Linus,

Em Tue, 17 Jul 2018 10:09:28 -0700
Linus Torvalds <torvalds@linux-foundation.org> escreveu:

> On Tue, Jul 17, 2018 at 4:58 AM Hanna Hawa <hannah@marvell.com> wrote:
> >
> > After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd
> > do its job" is problematic patch.  
> 
> Ok, this thread died down without any resolution.
> 
> >- Using v4.14.0 (including softirq patch) and the additional fix
> > proposed by Linus - no timeout issue.  
> 
> Are you talking about the patch that made HI_SOFTIRQ and
> TASKLET_SOFTIRQ special, and had this:
> 
>   #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> 
> in it?
> 
> I think I'll just commit the damn thing. It's hacky, but it's simple,
> and it never got applied because we had smarter suggestions. But the
> smarter suggestions never ended up being applied either, so..

Yeah, IMHO the best would be to apply your patch[1], c/c stable up to
4.9. Nothing prevents applying a better/smarter solution once we
have it. From my side, I can keep testing whatever smart suggestions
people propose. Yet, better to have one fix on our hand than two
fixes flying around.

[1] e. g. 
	https://git.linuxtv.org/mchehab/experimental.git/commit/?h=v4.15%2bmedia%2bdwc2&id=ccf833fd4a5b99c3d3cf2c09c065670f74a230a7

Regards,
Mauro
diff mbox

Patch

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 2f5e87f1bae2..97b080956fea 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -79,12 +79,16 @@  static void wakeup_softirqd(void)
 
 /*
  * If ksoftirqd is scheduled, we do not want to process pending softirqs
- * right now. Let ksoftirqd handle this at its own rate, to get fairness.
+ * right now. Let ksoftirqd handle this at its own rate, to get fairness,
+ * unless we're doing some of the synchronous softirqs.
  */
-static bool ksoftirqd_running(void)
+#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
+static bool ksoftirqd_running(unsigned long pending)
 {
 	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
 
+	if (pending & SOFTIRQ_NOW_MASK)
+		return false;
 	return tsk && (tsk->state == TASK_RUNNING);
 }
 
@@ -325,7 +329,7 @@  asmlinkage __visible void do_softirq(void)
 
 	pending = local_softirq_pending();
 
-	if (pending && !ksoftirqd_running())
+	if (pending && !ksoftirqd_running(pending))
 		do_softirq_own_stack();
 
 	local_irq_restore(flags);
@@ -352,7 +356,7 @@  void irq_enter(void)
 
 static inline void invoke_softirq(void)
 {
-	if (ksoftirqd_running())
+	if (ksoftirqd_running(local_softirq_pending()))
 		return;
 
 	if (!force_irqthreads) {