From patchwork Tue Oct 3 15:49:20 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Arnaldo Carvalho de Melo
X-Patchwork-Id: 9983155
From: Arnaldo Carvalho de Melo
To: bigeasy@linutronix.de
Cc:
 linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org,
 Arnaldo Carvalho de Melo, Clark Williams, Dean Luick,
 Dennis Dalessandro, Doug Ledford, Julia Cartwright, Kaike Wan,
 Leon Romanovsky, linux-rdma@vger.kernel.org, Peter Zijlstra,
 Sebastian Andrzej Siewior, Sebastian Sanchez, Steven Rostedt,
 Thomas Gleixner
Subject: [PATCH 2/2] IB/hfi1: Handle packets in the threaded handler only
Date: Tue, 3 Oct 2017 12:49:20 -0300
Message-Id: <20171003154920.31566-3-acme@kernel.org>
X-Mailer: git-send-email 2.13.6
In-Reply-To: <20171003154920.31566-1-acme@kernel.org>
References: <20171003154920.31566-1-acme@kernel.org>
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Arnaldo Carvalho de Melo

The hfi1 driver calls request_threaded_irq() with two handlers:

	handler = receive_context_interrupt;
	thread = receive_context_thread;

	request_threaded_irq(me->msix.vector, handler, thread, 0, me->name, arg);

It tries to process packets in the hard irq handler,
receive_context_interrupt(), and wakes up the thread (by returning
IRQ_WAKE_THREAD) only when the number of packets available in the NIC
crosses a threshold, in an attempt to balance latency and bandwidth.
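
For readers who do not have the driver in front of them, the split
described above can be modelled in a few lines of userspace C. This is
only a sketch, not the hfi1 code: the sim_* names, the enum and the
SIM_HARDIRQ_BUDGET value are made up for illustration, and the real
threshold logic sits behind rcd->do_interrupt().

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the kernel's irqreturn_t values; not the real enum. */
enum sim_irqreturn { SIM_IRQ_HANDLED, SIM_IRQ_WAKE_THREAD };

/* Hypothetical cap on packets handled per hard irq; the real threshold differs. */
#define SIM_HARDIRQ_BUDGET 16

/*
 * Models the hard irq half: handle packets up to the budget and ask for
 * the thread only if packets are still pending afterwards.
 */
enum sim_irqreturn sim_hard_handler(int *pending)
{
	int handled = 0;

	assert(pending != NULL);

	while (*pending > 0 && handled < SIM_HARDIRQ_BUDGET) {
		(*pending)--;
		handled++;
	}

	return *pending > 0 ? SIM_IRQ_WAKE_THREAD : SIM_IRQ_HANDLED;
}

/* Models the threaded half: drains whatever is left and returns that count. */
int sim_thread_handler(int *pending)
{
	int handled;

	assert(pending != NULL);

	handled = *pending;
	*pending = 0;
	return handled;
}
```

With 40 packets pending, sim_hard_handler() handles 16 and returns
SIM_IRQ_WAKE_THREAD, leaving 24 for sim_thread_handler(). With 3 packets
pending it returns SIM_IRQ_HANDLED and the thread is never woken.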

But on a CONFIG_PREEMPT_RT_FULL kernel this ends up taking spinlocks,
which are sleeping rtmutex-based locks on RT, from the hard irq handler
(receive_context_interrupt), which causes BUGs like this:

  [ 1002.740581] hfi1 0000:21:00.0: hfi1_0: set_link_state: current ARMED, new ACTIVE
  [ 1002.740583] hfi1 0000:21:00.0: hfi1_0: logical state changed to PORT_ACTIVE (0x4)
  [ 1002.740599] hfi1 0000:21:00.0: hfi1_0: send_idle_message: sending idle message 0x203
  [ 1002.741873] hfi1 0000:21:00.0: hfi1_0: read_idle_message: read idle message 0x203
  [ 1002.741874] hfi1 0000:21:00.0: hfi1_0: handle_sma_message: SMA message 0x2
  [ 1002.741923] hfi1 0000:21:00.0: hfi1_0: Switching to NO_DMA_RTAIL
  [ 1004.744192] IPv6: ADDRCONF(NETDEV_CHANGE): hfi1_opa0: link becomes ready
  [ 1167.907754] ------------[ cut here ]------------
  [ 1167.907756] kernel BUG at kernel/rtmutex.c:902!
  [ 1167.907758] invalid opcode: 0000 [#1] PREEMPT SMP
  [ 1167.907805] CPU: 10 PID: 1505 Comm: hfi1_cq0 Not tainted 3.10.0-708.rt56.635.test.el7.x86_64 #1
  [ 1167.907823] Call Trace:
  [ 1167.907826]
  [ 1167.907850] [] ? hfi1_rvt_get_rwqe+0x141/0x400 [hfi1]
  [ 1167.907852] [] rt_spin_lock+0x25/0x30
  [ 1167.907856] [] queue_kthread_work+0x24/0x60
  [ 1167.907861] [] rvt_cq_enter+0x17b/0x250 [rdmavt]
  [ 1167.907869] [] hfi1_rc_rcv+0x67a/0x1260 [hfi1]
  [ 1167.907878] [] hfi1_ib_rcv+0x2c8/0x400 [hfi1]
  [ 1167.907886] [] process_receive_ib+0x6c/0x150 [hfi1]
  [ 1167.907888] [] ? enqueue_pushable_task+0x6d/0x90
  [ 1167.907895] [] handle_receive_interrupt_nodma_rtail+0x161/0x310 [hfi1]
  [ 1167.907914] [] receive_context_interrupt+0x53/0x390 [hfi1]
  [ 1167.907917] [] __handle_irq_event_percpu+0x56/0x240
  [ 1167.907919] [] ? rt_spin_lock+0x16/0x30
  [ 1167.907920] [] handle_irq_event_percpu+0x49/0xa0
  [ 1167.907922] [] handle_irq_event+0x78/0xb0
  [ 1167.907924] [] handle_edge_irq+0x99/0x1a0
  [ 1167.907926] [] handle_irq+0xbb/0x150
  [ 1167.907929] [] do_IRQ+0x4d/0xe0
  [ 1167.907931] [] common_interrupt+0x6d/0x6d
  [ 1167.907931]
  [ 1167.907932] [] ? rt_spin_lock+0x16/0x30
  [ 1167.907934] [] ? kthread_worker_fn+0xb5/0x170
  [ 1167.907935] [] ? flush_kthread_work+0x130/0x130
  [ 1167.907937] [] kthread+0xcf/0xe0
  [ 1167.907938] [] ? kthread_worker_fn+0x170/0x170
  [ 1167.907940] [] ret_from_fork+0x58/0x90
  [ 1167.907941] [] ? kthread_worker_fn+0x170/0x170
  [ 1167.907951] Code: 90 e8 eb f0 ff ff e9 d4 fd ff ff 66 0f 1f 44 00 00 e8 db f0 ff ff eb b6 0f 0b 0f 1f 80 00 00 00 00 e8 0b f7 a3 ff e8 46 86 9c ff <0f> 0b 0f 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 65 4c 8b 3c
  [ 1167.907952] RIP [] rt_spin_lock_slowlock+0x34a/0x350
  [ 1167.907952] RSP

To make this work on RT, keep only the prologue that clears the chip
receive interrupt and immediately return IRQ_WAKE_THREAD, deferring all
packet processing, with its locking, to the thread.

With this change, test systems are able to pass traffic over this
hardware using a CONFIG_PREEMPT_RT_FULL patched kernel without
triggering these BUGs.

Cc: Clark Williams
Cc: Dean Luick
Cc: Dennis Dalessandro
Cc: Doug Ledford
Cc: Julia Cartwright
Cc: Kaike Wan
Cc: Leon Romanovsky
Cc: linux-rdma@vger.kernel.org
Cc: Peter Zijlstra
Cc: Sebastian Andrzej Siewior
Cc: Sebastian Sanchez
Cc: Steven Rostedt
Cc: Thomas Gleixner
Signed-off-by: Arnaldo Carvalho de Melo
---
 drivers/infiniband/hw/hfi1/chip.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 121a4c920f1b..733a00d8ea4c 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -8226,15 +8226,17 @@ static irqreturn_t receive_context_interrupt(int irq, void *data)
 {
 	struct hfi1_ctxtdata *rcd = data;
 	struct hfi1_devdata *dd = rcd->dd;
-	int disposition;
-	int present;
 
 	trace_hfi1_receive_interrupt(dd, rcd->ctxt);
 	this_cpu_inc(*dd->int_counter);
 	aspm_ctx_disable(rcd);
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+	return IRQ_WAKE_THREAD;
+#else
+{
 	/* receive interrupt remains blocked while processing packets */
-	disposition = rcd->do_interrupt(rcd, 0);
+	int disposition = rcd->do_interrupt(rcd, 0), present;
 
 	/*
 	 * Too many packets were seen while processing packets in this
@@ -8257,6 +8259,8 @@ static irqreturn_t receive_context_interrupt(int irq, void *data)
 
 	return IRQ_HANDLED;
 }
+#endif
+}
 
 /*
  * Receive packet thread handler. This expects to be invoked with the