Message ID | CA+55aFwuAojr7vAfiRO-2je-wDs7pu+avQZNhX_k9NN=D7_zVQ@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Em Mon, 8 Jan 2018 11:51:04 -0800 Linus Torvalds <torvalds@linux-foundation.org> escreveu: > On Mon, Jan 8, 2018 at 11:15 AM, Alan Stern <stern@rowland.harvard.edu> wrote: > > > > Both dwc2_hsotg and ehci-hcd use the tasklets embedded in the > > giveback_urb_bh member of struct usb_hcd. See usb_hcd_giveback_urb() > > in drivers/usb/core/hcd.c; the calls are > > > > else if (high_prio_bh) > > tasklet_hi_schedule(&bh->bh); > > else > > tasklet_schedule(&bh->bh); > > > > As it turns out, high_prio_bh gets set for interrupt and isochronous > > URBs but not for bulk and control URBs. The DVB driver in question > > uses bulk transfers. > > Ok, so we could try out something like the appended? > > NOTE! I have not tested this at all. It LooksObvious(tm), but... > > Linus > kernel/softirq.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/kernel/softirq.c b/kernel/softirq.c > index 2f5e87f1bae2..97b080956fea 100644 > --- a/kernel/softirq.c > +++ b/kernel/softirq.c > @@ -79,12 +79,16 @@ static void wakeup_softirqd(void) > > /* > * If ksoftirqd is scheduled, we do not want to process pending softirqs > - * right now. Let ksoftirqd handle this at its own rate, to get fairness. > + * right now. Let ksoftirqd handle this at its own rate, to get fairness, > + * unless we're doing some of the synchronous softirqs. > */ > -static bool ksoftirqd_running(void) > +#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ)) > +static bool ksoftirqd_running(unsigned long pending) > { > struct task_struct *tsk = __this_cpu_read(ksoftirqd); > > + if (pending & SOFTIRQ_NOW_MASK) > + return false; > return tsk && (tsk->state == TASK_RUNNING); > } > > @@ -325,7 +329,7 @@ asmlinkage __visible void do_softirq(void) > > pending = local_softirq_pending(); > > - if (pending && !ksoftirqd_running()) > + if (pending && !ksoftirqd_running(pending)) > do_softirq_own_stack(); > > local_irq_restore(flags); > @@ -352,7 +356,7 @@ void irq_enter(void) > > static inline void invoke_softirq(void) > { > - if (ksoftirqd_running()) > + if (ksoftirqd_running(local_softirq_pending())) > return; > > if (!force_irqthreads) { Hi Linus, Patch makes sense to me, although I was not able to test it myself. I set a RPi3 machine here with vanilla Kernel 4.14.11 running a standard raspbian distribution (with elevator=deadline). Right now, I'm trying to reproduce the bug with dvbv5-zap. I may eventually do more tests on some other slow machines. Usually, applications like tvheadend records just one channel. So, instead of a ~58 Mbits/s payload, it uses, typically, ~11 Mbits/s for a HD channel. This is usually filtered by hardware. Here, I'm forcing to record the entire TS, in order to make easier to reproduce the issue. So, I'm forcing a condition that it is usually worse than real usecases (at last for HD - I I don't have any DVB stream here with a 4K channel). From what I checked so far, with vanila upstream Kernel on RPi3, just receiving a DVB stream - or receiving it and writing to /dev/null works with or without your patch. The problem starts to happen when there are concurrency with writes. On my preliminar tests, writing to a file on an ext4 partition at a USB stick loses data up to the point to make it useless (1/4 of the data is lost!). However, writing to a class 10 microSD card is doable. If you're curious enough, this is what I'm doing (that are the results while using class 10 microSD card): $ FILE=/tmp/out.ts; for i in $(seq 1 6); do echo "step $i"; rm $FILE 2>/dev/null; dvbv5-zap -l universal -c ~/vivo-channels.conf NBR -o $FILE -P -t60 2>&1|grep -E "(buffer|received)"; du $FILE 2>/dev/null; done step 1 Setting buffer length to 7250000 buffer overrun buffer overrun buffer overrun buffer overrun buffer overrun buffer overrun buffer overrun received 347504652 bytes (5656 Kbytes/sec) 339368 /tmp/out.ts step 2 Setting buffer length to 7250000 buffer overrun received 408995880 bytes (6656 Kbytes/sec) 399416 /tmp/out.ts step 3 Setting buffer length to 7250000 received 412999716 bytes (6722 Kbytes/sec) 403328 /tmp/out.ts step 4 Setting buffer length to 7250000 buffer overrun received 415564788 bytes (6763 Kbytes/sec) 405832 /tmp/out.ts step 5 Setting buffer length to 7250000 received 412999716 bytes (6722 Kbytes/sec) 403324 /tmp/out.ts step 6 Setting buffer length to 7250000 received 408366080 bytes (6646 Kbytes/sec) 398796 /tmp/out.ts My plan is to do more tests along this week, and try to tweak a little bit both userspace and kernelspace, in order to see if I can get better results. Thanks, Mauro
On Tue, Jan 9, 2018 at 9:42 AM, Mauro Carvalho Chehab <mchehab@s-opensource.com> wrote: > > On my preliminar tests, writing to a file on an ext4 partition at a > USB stick loses data up to the point to make it useless (1/4 of the data > is lost!). However, writing to a class 10 microSD card is doable. Note that most USB sticks are horrible crap. They can have write latencies counted in _seconds_. You can cause VM issues and various other non-hardware stalls with them, simply because something gets stuck waiting for a page writeout that should take a few ms on any reasonable hardware, but ends up talking half a second or more. For example, even really well-written software that tries to do things like threaded write-behind to smooth out the IO will be _totally_ screwed by the USB stick behavior (where you might write a few MB at high speeds, and then the next write - however small - takes a second because the stupid USB stick does a synchronous garbage collection. Suddenly all that clever software that tried to keep things moving along smoothly without any hiccups, and tried hard to make the USB bus have a nice constant loadm can't do anything at all about the crap hardware. So when testing writes to USB sticks, I'm not convinced you're actually testing any USB bus limitations or even really any other hardware limitations than the USB stick itself. Linus
On Tue, 9 Jan 2018 15:42:35 -0200 Mauro Carvalho Chehab <mchehab@s-opensource.com> wrote: > Em Mon, 8 Jan 2018 11:51:04 -0800 Linus Torvalds <torvalds@linux-foundation.org> escreveu: > [...] > Patch makes sense to me, although I was not able to test it myself. The patch also make sense to me. I've done some basic testing with it on my high-end Broadwell system (that I use for 100Gbit/s testing). As expected the network overload case still works, as NET_RX_SOFTIRQ is not matched. > I set a RPi3 machine here with vanilla Kernel 4.14.11 running a > standard raspbian distribution (with elevator=deadline). I found a Raspberry Pi Model B+ (I think, BCM2835), that I loaded the LibreELEC distro on. One of the guys even created an image for me with a specific kernel[1] (that I just upgraded the system with). [1] https://forum.libreelec.tv/thread/4235-dvb-issue-since-le-switched-to-kernel-4-9-x/?postID=77031#post77031 > My plan is to do more tests along this week, and try to tweak a little > bit both userspace and kernelspace, in order to see if I can get > better results. I've previously experienced that you can be affected by the scheduler granularity, which is adjustable (with CONFIG_SCHED_DEBUG=y): $ grep -H . /proc/sys/kernel/sched_*_granularity_ns /proc/sys/kernel/sched_min_granularity_ns:2250000 /proc/sys/kernel/sched_wakeup_granularity_ns:3000000 The above numbers were confirmed on the RPi2 (see[2]). With commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job"), I expect/assume that softirq processing latency is bounded by the sched_wakeup_granularity_ns, which with 3 ms is not good enough for their use-case. Thus, if you manage to reproduce the case, try to see if adjusting this can mitigate the issue... Their system have non-preempt kernel, should they use PREEMPT? LibreELEC:~ # uname -a Linux LibreELEC 4.14.10 #1 SMP Tue Jan 9 17:35:03 GMT 2018 armv7l GNU/Linux [2] https://forum.libreelec.tv/thread/4235-dvb-issue-since-le-switched-to-kernel-4-9-x/?postID=76999#post76999
On Tue, 2018-01-09 at 22:26 +0100, Jesper Dangaard Brouer wrote: > > I've previously experienced that you can be affected by the scheduler > granularity, which is adjustable (with CONFIG_SCHED_DEBUG=y): > > $ grep -H . /proc/sys/kernel/sched_*_granularity_ns > /proc/sys/kernel/sched_min_granularity_ns:2250000 > /proc/sys/kernel/sched_wakeup_granularity_ns:3000000 > > The above numbers were confirmed on the RPi2 (see[2]). With commit > 4cd13c21b207 ("softirq: Let ksoftirqd do its job"), I expect/assume that > softirq processing latency is bounded by the sched_wakeup_granularity_ns, > which with 3 ms is not good enough for their use-case. Note of caution wrt twiddling sched_wakeup_granularity_ns: it must remain < sched_latency_ns/2 else you effectively disable wakeup preemption completely, turning CFS into a tick granularity scheduler. -Mike
Hi, I'm a software developer working in Marvell SoC team. I'm facing kernel panic issue while running raid 5 on sata disks connected to Macchiatobin (Marvell community board with Armada-8040 SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2) uses a tasklet to clean the done descriptors from the queue. The panic (see below) occurs while building the RAID-5 (mdadm) or while writing/reading to the raid partition. After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd do its job" is problematic patch. - Using v4.14.0 and problematic patch reverted - no timout issue. - Using v4.14.0 (including softirq patch) and the additional fix proposed by Linus - no timeout issue. As others have reported in this thread, the softirq change is causing some regression. Would it be possible to either revert the patch or apply a fix such as the one proposed by Linus ? Below panic message: [ 25.371495] mv_xor_v2 f0400000.xor: dma_sync_wait: timeout! [ 25.377101] Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction [ 25.377101] [ 25.386973] CPU: 0 PID: 1417 Comm: md0_raid5 Not tainted 4.14.0 #16 [ 25.393264] Hardware name: Marvell Armada 8040 DB board (DT) [ 25.398946] Call trace: [ 25.401410] [<ffff000008089310>] dump_backtrace+0x0/0x380 [ 25.406831] [<ffff0000080896a4>] show_stack+0x14/0x20 [ 25.411904] [<ffff00000890fa78>] dump_stack+0x98/0xb8 [ 25.416976] [<ffff0000080c8ef0>] panic+0x118/0x280 [ 25.421788] [<ffff000008386a44>] async_tx_quiesce+0x74/0x78 [ 25.427382] [<ffff000008386ca4>] async_memcpy+0x1a4/0x2a0 [ 25.432806] [<ffff000008747f9c>] async_copy_data.isra.16+0x1b4/0x280 [ 25.439186] [<ffff00000874b6fc>] raid_run_ops+0x514/0x1320 [ 25.444694] [<ffff000008751550>] handle_stripe+0x1040/0x2848 [ 25.450377] [<ffff000008752f98>] handle_active_stripes.isra.28+0x240/0x460 [ 25.457279] [<ffff000008753468>] raid5d+0x2b0/0x450 [ 25.462177] [<ffff00000875ead4>] md_thread+0x104/0x160 [ 25.467338] [<ffff0000080e638c>] kthread+0xfc/0x128 [ 25.472234] [<ffff000008085354>] ret_from_fork+0x10/0x1c [ 25.477571] Kernel Offset: disabled [ 25.481073] CPU features: 0x002000 [ 25.484487] Memory Limit: none [ 25.487556] ---[ end Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction [ 25.487556] Thanks, Hanna
On Tue, Jul 17, 2018 at 4:58 AM Hanna Hawa <hannah@marvell.com> wrote: > > After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd > do its job" is problematic patch. Ok, this thread died down without any resolution. >- Using v4.14.0 (including softirq patch) and the additional fix > proposed by Linus - no timeout issue. Are you talking about the patch that made HI_SOFTIRQ and TASKLET_SOFTIRQ special, and had this: #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ)) in it? I think I'll just commit the damn thing. It's hacky, but it's simple, and it never got applied because we had smarter suggestions. But the smarter suggestions never ended up being applied either, so.. Linus
Hi Linus, On 07/17/2018 08:09 PM, Linus Torvalds wrote: > On Tue, Jul 17, 2018 at 4:58 AM Hanna Hawa <hannah@marvell.com> wrote: >> >> After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd >> do its job" is problematic patch. > > Ok, this thread died down without any resolution. > >> - Using v4.14.0 (including softirq patch) and the additional fix >> proposed by Linus - no timeout issue. > > Are you talking about the patch that made HI_SOFTIRQ and > TASKLET_SOFTIRQ special, and had this: > > #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ)) > > in it? yes, exactly.. Link to the patch: https://git.linuxtv.org/mchehab/experimental.git/commit/?h=v4.15%2bmedia%2bdwc2&id=ccf833fd4a5b99c3d3cf2c09c065670f74a230a7 Thanks, Hanna > > I think I'll just commit the damn thing. It's hacky, but it's simple, > and it never got applied because we had smarter suggestions. But the > smarter suggestions never ended up being applied either, so.. > > Linus >
Hi Linus, Em Tue, 17 Jul 2018 10:09:28 -0700 Linus Torvalds <torvalds@linux-foundation.org> escreveu: > On Tue, Jul 17, 2018 at 4:58 AM Hanna Hawa <hannah@marvell.com> wrote: > > > > After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd > > do its job" is problematic patch. > > Ok, this thread died down without any resolution. > > >- Using v4.14.0 (including softirq patch) and the additional fix > > proposed by Linus - no timeout issue. > > Are you talking about the patch that made HI_SOFTIRQ and > TASKLET_SOFTIRQ special, and had this: > > #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ)) > > in it? > > I think I'll just commit the damn thing. It's hacky, but it's simple, > and it never got applied because we had smarter suggestions. But the > smarter suggestions never ended up being applied either, so.. Yeah, IMHO the best would be to apply your patch[1], c/c stable up to 4.9. Nothing prevents applying a better/smarter solution once we have it. From my side, I can keep testing whatever smart suggestions people propose. Yet, better to have one fix on our hand than two fixes flying around. [1] e. g. https://git.linuxtv.org/mchehab/experimental.git/commit/?h=v4.15%2bmedia%2bdwc2&id=ccf833fd4a5b99c3d3cf2c09c065670f74a230a7 Regards, Mauro
diff --git a/kernel/softirq.c b/kernel/softirq.c index 2f5e87f1bae2..97b080956fea 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -79,12 +79,16 @@ static void wakeup_softirqd(void) /* * If ksoftirqd is scheduled, we do not want to process pending softirqs - * right now. Let ksoftirqd handle this at its own rate, to get fairness. + * right now. Let ksoftirqd handle this at its own rate, to get fairness, + * unless we're doing some of the synchronous softirqs. */ -static bool ksoftirqd_running(void) +#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ)) +static bool ksoftirqd_running(unsigned long pending) { struct task_struct *tsk = __this_cpu_read(ksoftirqd); + if (pending & SOFTIRQ_NOW_MASK) + return false; return tsk && (tsk->state == TASK_RUNNING); } @@ -325,7 +329,7 @@ asmlinkage __visible void do_softirq(void) pending = local_softirq_pending(); - if (pending && !ksoftirqd_running()) + if (pending && !ksoftirqd_running(pending)) do_softirq_own_stack(); local_irq_restore(flags); @@ -352,7 +356,7 @@ void irq_enter(void) static inline void invoke_softirq(void) { - if (ksoftirqd_running()) + if (ksoftirqd_running(local_softirq_pending())) return; if (!force_irqthreads) {