[rdma-next,1/2] IB/rxe: Fix kernel panic from skb destructor

Message ID 20170622141000.9899-1-leon@kernel.org (mailing list archive)
State Accepted

Commit Message

Leon Romanovsky June 22, 2017, 2:09 p.m. UTC
From: Yonatan Cohen <yonatanc@mellanox.com>

Between the time rxe_send() returns and the time the skb destructor is
called, the QP's ref count might drop to 0, allowing the QP to be
destroyed. The destructor then dereferences the destroyed QP, which
leads to a kernel panic.

Take a QP reference in rxe_send() and drop it in the skb destructor to
prevent this crash.

BUG: unable to handle kernel NULL pointer dereference at 000000000000072c
IP: [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
PGD 0 [16240.211178]
Oops: 0002 [#1] SMP
CPU: 3 PID: 0 Comm: swapper/3 Tainted: G           OE   4.9.0-mlnx #1
Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
task: ffff88042d6b1480 task.stack: ffffc90001904000
RIP: 0010:[<ffffffffa05df765>]  [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
RSP: 0018:ffff88043fcc3df0  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880429684700 RCX: ffff88042d248200
RDX: 00000000ffffffff RSI: 00000000fffffe01 RDI: ffff880429684700
RBP: ffff88043fcc3e00 R08: ffff88043fcda240 R09: 00000000ff2d1de6
R10: 0000000000000000 R11: 00000000f49cf6fe R12: ffff880429684700
R13: ffffffff81893f96 R14: ffffffff817d66f0 R15: ffff880427f74200
FS:  0000000000000000(0000) GS:ffff88043fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000072c CR3: 000000041d3df000 CR4: 00000000000006e0
Stack:
 ffffffff817b29cf ffff880429684700 ffff88043fcc3e18 ffffffff817b42c2
 ffff880429684700 ffff88043fcc3e40 ffffffff817b4332 ffff880429684700
 ffff880427f74238 ffff880427f74228 ffff88043fcc3e58 ffffffff81893f96
Call Trace:
 <IRQ> [16240.336345]  [<ffffffff817b29cf>] ? skb_release_head_state+0x4f/0xb0
 [<ffffffff817b42c2>] skb_release_all+0x12/0x30
 [<ffffffff817b4332>] kfree_skb+0x32/0x90
 [<ffffffff81893f96>] ndisc_error_report+0x36/0x40
 [<ffffffff817d4de1>] neigh_invalidate+0x81/0xf0
 [<ffffffff817d68f7>] neigh_timer_handler+0x207/0x2b0
 [<ffffffff81109295>] call_timer_fn+0x35/0x120
 [<ffffffff81109db7>] run_timer_softirq+0x1d7/0x460
 [<ffffffff8106155e>] ? kvm_sched_clock_read+0x1e/0x30
 [<ffffffff810366b9>] ? sched_clock+0x9/0x10
 [<ffffffff810cfed2>] ? sched_clock_cpu+0x72/0xa0
 [<ffffffff818dd537>] __do_softirq+0xd7/0x289
 [<ffffffff810a6c95>] irq_exit+0xb5/0xc0
 [<ffffffff818dd372>] smp_apic_timer_interrupt+0x42/0x50
 [<ffffffff818dc682>] apic_timer_interrupt+0x82/0x90
 <EOI> [16240.395776]  [<ffffffff818da156>] ? native_safe_halt+0x6/0x10
 [<ffffffff818d9e6e>] default_idle+0x1e/0xd0
 [<ffffffff8103797f>] arch_cpu_idle+0xf/0x20
 [<ffffffff818da2c5>] default_idle_call+0x35/0x40
 [<ffffffff810e3eb5>] cpu_startup_entry+0x185/0x210
 [<ffffffff81050433>] start_secondary+0x103/0x130
RIP  [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
 drivers/infiniband/sw/rxe/rxe_net.c | 3 +++
 1 file changed, 3 insertions(+)

--
2.13.1

Comments

Johannes Thumshirn June 22, 2017, 2:18 p.m. UTC | #1
On Thu, Jun 22, 2017 at 05:09:59PM +0300, Leon Romanovsky wrote:
> From: Yonatan Cohen <yonatanc@mellanox.com>
> 
> In the time between rxe_send has finished and skb destructor
> called, the QP's ref count might be 0, leading to a possible
> QP destruction. This will lead to a kernel panic when the destructor
> dereferences the QP.
> 
> The operation of incrementing QP ref count at rxe_send and decrementing
> from skb destructor will prevent this crash.
> 
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
> Reviewed-by: Moni Shoua <monis@mellanox.com>
> Signed-off-by: Leon Romanovsky <leon@kernel.org>
> ---

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

Doug Ledford July 22, 2017, 5:14 p.m. UTC | #2
On 6/22/2017 10:09 AM, Leon Romanovsky wrote:
> From: Yonatan Cohen <yonatanc@mellanox.com>
> 
> In the time between rxe_send has finished and skb destructor
> called, the QP's ref count might be 0, leading to a possible
> QP destruction. This will lead to a kernel panic when the destructor
> dereferences the QP.
> 
> The operation of incrementing QP ref count at rxe_send and decrementing
> from skb destructor will prevent this crash.

This series has been applied to 4.13-rc, thanks.

FWIW Leon, if you have a series, even if it's a short 2 patch series, I
prefer a cover letter.  It doesn't need to be long, and can just be
something like "I have two bug fixes for rxe in this series", but I
dislike responding to a patch and saying I applied a series.

Leon Romanovsky July 23, 2017, 5:21 a.m. UTC | #3
On Sat, Jul 22, 2017 at 01:14:26PM -0400, Doug Ledford wrote:
> On 6/22/2017 10:09 AM, Leon Romanovsky wrote:
> > From: Yonatan Cohen <yonatanc@mellanox.com>
> >
> > In the time between rxe_send has finished and skb destructor
> > called, the QP's ref count might be 0, leading to a possible
> > QP destruction. This will lead to a kernel panic when the destructor
> > dereferences the QP.
> >
> > The operation of incrementing QP ref count at rxe_send and decrementing
> > from skb destructor will prevent this crash.
>
> This series has been applied to 4.13-rc, thanks.
>
> FWIW Leon, if you have a series, even if it's a short 2 patch series, I
> prefer a cover letter.  It doesn't need to be long, and can just be
> something like "I have two bug fixes for rxe in this series", but I
> dislike responding to a patch and saying I applied a series.

No problem, will do.

Thanks

>
> --
> Doug Ledford <dledford@redhat.com>
>     GPG Key ID: B826A3330E572FDD
>     Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
>

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
index c3a140ed4df2..08f3f90d2912 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -441,6 +441,8 @@  static void rxe_skb_tx_dtor(struct sk_buff *skb)
 	if (unlikely(qp->need_req_skb &&
 		     skb_out < RXE_INFLIGHT_SKBS_PER_QP_LOW))
 		rxe_run_task(&qp->req.task, 1);
+
+	rxe_drop_ref(qp);
 }

 int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, struct sk_buff *skb)
@@ -473,6 +475,7 @@  int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, struct sk_buff *skb)
 		return -EAGAIN;
 	}

+	rxe_add_ref(pkt->qp);
 	atomic_inc(&pkt->qp->skb_out);
 	kfree_skb(skb);