Message ID | 20230627105209.15163-1-dg573847474@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net/802/garp: fix potential deadlock on &app->lock | expand |
On Tue, Jun 27, 2023 at 12:52 PM Chengfeng Ye <dg573847474@gmail.com> wrote: > > As &app->lock is also acquired by the timer garp_join_timer() which > which executes under soft-irq context, code executing under process > context should disable irq before acquiring the lock, otherwise > deadlock could happen if the process context hold the lock then > preempt by the interruption. > > garp_pdu_rcv() is one such function that acquires &app->lock, but I > am not sure whether it is called with irq disable outside thus the > patch could be false. > > Possible deadlock scenario: > garp_pdu_rcv() > -> spin_lock(&app->lock) > <timer interrupt> This can not happen. RX handlers are called from BH context, and rcu_read_lock() See net/core/dev.c, deliver_skb() and netif_receive_skb() > -> garp_join_timer() > -> spin_lock(&app->lock) > > This flaw was found using an experimental static analysis tool we are > developing for irq-related deadlock. > > The tentative patch fix the potential deadlock by spin_lock_irqsave(), > or it should be fixed with spin_lock_bh() if it is a real bug? I am > not very sure. I guess more work is needed at your side :)
I should have noticed that _rcv suffix mostly denotes RX handlers, thanks much for pointing that out! Best regards, Chengfeng
diff --git a/net/802/garp.c b/net/802/garp.c index ab24b21fbb49..acc6f2f847a6 100644 --- a/net/802/garp.c +++ b/net/802/garp.c @@ -515,6 +515,7 @@ static void garp_pdu_rcv(const struct stp_proto *proto, struct sk_buff *skb, struct garp_port *port; struct garp_applicant *app; const struct garp_pdu_hdr *gp; + unsigned long flags; port = rcu_dereference(dev->garp_port); if (!port) @@ -530,14 +531,14 @@ static void garp_pdu_rcv(const struct stp_proto *proto, struct sk_buff *skb, goto err; skb_pull(skb, sizeof(*gp)); - spin_lock(&app->lock); + spin_lock_irqsave(&app->lock, flags); while (skb->len > 0) { if (garp_pdu_parse_msg(app, skb) < 0) break; if (garp_pdu_parse_end_mark(skb) < 0) break; } - spin_unlock(&app->lock); + spin_unlock_irqrestore(&app->lock, flags); err: kfree_skb(skb); }
As &app->lock is also acquired by the timer garp_join_timer() which which executes under soft-irq context, code executing under process context should disable irq before acquiring the lock, otherwise deadlock could happen if the process context hold the lock then preempt by the interruption. garp_pdu_rcv() is one such function that acquires &app->lock, but I am not sure whether it is called with irq disable outside thus the patch could be false. Possible deadlock scenario: garp_pdu_rcv() -> spin_lock(&app->lock) <timer interrupt> -> garp_join_timer() -> spin_lock(&app->lock) This flaw was found using an experimental static analysis tool we are developing for irq-related deadlock. The tentative patch fix the potential deadlock by spin_lock_irqsave(), or it should be fixed with spin_lock_bh() if it is a real bug? I am not very sure. Signed-off-by: Chengfeng Ye <dg573847474@gmail.com> --- net/802/garp.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)