Message ID | 1503687956-7110-2-git-send-email-andrew.boyer@dell.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Fri, Aug 25, 2017 at 03:05:46PM -0400, Andrew Boyer wrote: > The network stack will call nskb's destructor, rxe_skb_tx_dtor(), if the > packet gets dropped by ip_local_out()/ip6_local_out(). Thus we need to add > the QP ref before output to avoid extra dereferences during network > congestion. This could lead to unwanted destruction of the QP. > > Fix up the skb_out accounting, too. > > Fixes: fda85ce91240 ("IB/rxe: Fix kernel panic from skb destructor") > Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> > Acked-by: Moni Shoua <monis@mellanox.com> > --- > drivers/infiniband/sw/rxe/rxe_net.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c > index 08f3f90..0810f38 100644 > --- a/drivers/infiniband/sw/rxe/rxe_net.c > +++ b/drivers/infiniband/sw/rxe/rxe_net.c > @@ -460,12 +460,17 @@ int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, struct sk_buff *skb) > nskb->destructor = rxe_skb_tx_dtor; > nskb->sk = pkt->qp->sk->sk; > > + rxe_add_ref(pkt->qp); > + atomic_inc(&pkt->qp->skb_out); > + > if (av->network_type == RDMA_NETWORK_IPV4) { > err = ip_local_out(dev_net(skb_dst(skb)->dev), nskb->sk, nskb); > } else if (av->network_type == RDMA_NETWORK_IPV6) { > err = ip6_local_out(dev_net(skb_dst(skb)->dev), nskb->sk, nskb); > } else { > pr_err("Unknown layer 3 protocol: %d\n", av->network_type); > + atomic_dec(&pkt->qp->skb_out); > + rxe_drop_ref(pkt->qp); > kfree_skb(nskb); > return -EINVAL; > } > @@ -475,10 +480,7 @@ int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, struct sk_buff *skb) > return -EAGAIN; > } Shouldn't we also dec and drop ref in case that net_xmit_eval returns error? > > - rxe_add_ref(pkt->qp); > - atomic_inc(&pkt->qp->skb_out); > kfree_skb(skb); > - > return 0; > } > > -- > 1.8.3.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 8/27/17, 8:01 AM, "linux-rdma-owner@vger.kernel.org on behalf of Yuval Shaia" <linux-rdma-owner@vger.kernel.org on behalf of yuval.shaia@oracle.com> wrote: >On Fri, Aug 25, 2017 at 03:05:46PM -0400, Andrew Boyer wrote: >> The network stack will call nskb's destructor, rxe_skb_tx_dtor(), if the >> packet gets dropped by ip_local_out()/ip6_local_out(). Thus we need to >>add >> the QP ref before output to avoid extra dereferences during network >> congestion. This could lead to unwanted destruction of the QP. >> >> Fix up the skb_out accounting, too. >> >> Fixes: fda85ce91240 ("IB/rxe: Fix kernel panic from skb destructor") >> Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> >> Acked-by: Moni Shoua <monis@mellanox.com> >> --- >> drivers/infiniband/sw/rxe/rxe_net.c | 8 +++++--- >> 1 file changed, 5 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/infiniband/sw/rxe/rxe_net.c >>b/drivers/infiniband/sw/rxe/rxe_net.c >> index 08f3f90..0810f38 100644 >> --- a/drivers/infiniband/sw/rxe/rxe_net.c >> +++ b/drivers/infiniband/sw/rxe/rxe_net.c >> @@ -460,12 +460,17 @@ int rxe_send(struct rxe_dev *rxe, struct >>rxe_pkt_info *pkt, struct sk_buff *skb) >> nskb->destructor = rxe_skb_tx_dtor; >> nskb->sk = pkt->qp->sk->sk; >> >> + rxe_add_ref(pkt->qp); >> + atomic_inc(&pkt->qp->skb_out); >> + >> if (av->network_type == RDMA_NETWORK_IPV4) { >> err = ip_local_out(dev_net(skb_dst(skb)->dev), nskb->sk, nskb); >> } else if (av->network_type == RDMA_NETWORK_IPV6) { >> err = ip6_local_out(dev_net(skb_dst(skb)->dev), nskb->sk, nskb); >> } else { >> pr_err("Unknown layer 3 protocol: %d\n", av->network_type); >> + atomic_dec(&pkt->qp->skb_out); >> + rxe_drop_ref(pkt->qp); >> kfree_skb(nskb); >> return -EINVAL; >> } >> @@ -475,10 +480,7 @@ int rxe_send(struct rxe_dev *rxe, struct >>rxe_pkt_info *pkt, struct sk_buff *skb) >> return -EAGAIN; >> } > >Shouldn't we also dec and drop ref in case that net_xmit_eval returns >error? That¹s the catch. net_xmit_eval() is just looking for error codes; it¹s not doing the actual transmit. The transmit already started back in ip_local_out()/ip6_local_out(), and if those end up failing the destructor will be called. Any cleanup we do in this error path would be duplicated in the destructor. -Andrew > >> >> - rxe_add_ref(pkt->qp); >> - atomic_inc(&pkt->qp->skb_out); >> kfree_skb(skb); >> - >> return 0; >> } >> >> -- >> 1.8.3.1 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >-- >To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c index 08f3f90..0810f38 100644 --- a/drivers/infiniband/sw/rxe/rxe_net.c +++ b/drivers/infiniband/sw/rxe/rxe_net.c @@ -460,12 +460,17 @@ int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, struct sk_buff *skb) nskb->destructor = rxe_skb_tx_dtor; nskb->sk = pkt->qp->sk->sk; + rxe_add_ref(pkt->qp); + atomic_inc(&pkt->qp->skb_out); + if (av->network_type == RDMA_NETWORK_IPV4) { err = ip_local_out(dev_net(skb_dst(skb)->dev), nskb->sk, nskb); } else if (av->network_type == RDMA_NETWORK_IPV6) { err = ip6_local_out(dev_net(skb_dst(skb)->dev), nskb->sk, nskb); } else { pr_err("Unknown layer 3 protocol: %d\n", av->network_type); + atomic_dec(&pkt->qp->skb_out); + rxe_drop_ref(pkt->qp); kfree_skb(nskb); return -EINVAL; } @@ -475,10 +480,7 @@ int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, struct sk_buff *skb) return -EAGAIN; } - rxe_add_ref(pkt->qp); - atomic_inc(&pkt->qp->skb_out); kfree_skb(skb); - return 0; }