Message ID | 1549733310-14966-1-git-send-email-pkaustub@cisco.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Jason Gunthorpe |
Headers | show |
Series | [for-next,v4] IB/usnic: fix deadlock | expand |
On Sat, Feb 09, 2019 at 09:28:30AM -0800, Parvi Kaustubhi wrote: > There is a dead lock in usnic ib_register and netdev_notify > path. > > usnic_ib_discover_pf() > | mutex_lock(&usnic_ib_ibdev_list_lock); > | usnic_ib_device_add(); > | ib_register_device() > | usnic_ib_query_port() > | mutex_lock(&us_ibdev->usdev_lock); > | ib_get_eth_speed() > | rtnl_lock() > > order of lock: &usnic_ib_ibdev_list_lock -> usdev_lock -> rtnl_lock > > rtnl_lock() > | usnic_ib_netdevice_event() > | mutex_lock(&usnic_ib_ibdev_list_lock); > > order of lock: rtnl_lock -> &usnic_ib_ibdev_list_lock > > Solution is to use ib_device_get_by_netdev() to lookup ib_dev while > handling netdev/ inet events. > > Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com> > Reviewed-by: Govindarajulu Varadarajan <gvaradar@cisco.com> > Reviewed-by: Tanmay Inamdar <tinamdar@cisco.com> > Changelog: > > v3->v4 > * Added missing ib_device_put* calls. > * Fixed wonky control flow > > v2->v3: > * Jason: drivers should not hold any locks while calling ib_unregister_device() > * Jason: use https://github.com/jgunthorpe/linux/commits/device_locking_cleanup > > v1->v2: > * Have notifier blocks in usnic_ib_dev instead of using workqueue to defer > event handling. > drivers/infiniband/hw/usnic/usnic_ib_main.c | 30 ++++++++++++++--------------- > 1 file changed, 14 insertions(+), 16 deletions(-) I added the hunks to use setnedev from my as-yet-unsent patch and applies this to for-next. I assume you tested on top of that. You should probably test the final result. Thanks, Jason
Yes, I was testing on top of commits in https://github.com/jgunthorpe/linux/commits/device_locking_cleanup I have tested the latest (with the updated patch) on branch wip/jgg-for-next Thanks, Parvi > On Feb 18, 2019, at 8:38 PM, Jason Gunthorpe <jgg@ziepe.ca> wrote: > > On Sat, Feb 09, 2019 at 09:28:30AM -0800, Parvi Kaustubhi wrote: >> There is a dead lock in usnic ib_register and netdev_notify >> path. >> >> usnic_ib_discover_pf() >> | mutex_lock(&usnic_ib_ibdev_list_lock); >> | usnic_ib_device_add(); >> | ib_register_device() >> | usnic_ib_query_port() >> | mutex_lock(&us_ibdev->usdev_lock); >> | ib_get_eth_speed() >> | rtnl_lock() >> >> order of lock: &usnic_ib_ibdev_list_lock -> usdev_lock -> rtnl_lock >> >> rtnl_lock() >> | usnic_ib_netdevice_event() >> | mutex_lock(&usnic_ib_ibdev_list_lock); >> >> order of lock: rtnl_lock -> &usnic_ib_ibdev_list_lock >> >> Solution is to use ib_device_get_by_netdev() to lookup ib_dev while >> handling netdev/ inet events. >> >> Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com> >> Reviewed-by: Govindarajulu Varadarajan <gvaradar@cisco.com> >> Reviewed-by: Tanmay Inamdar <tinamdar@cisco.com> >> Changelog: >> >> v3->v4 >> * Added missing ib_device_put* calls. >> * Fixed wonky control flow >> >> v2->v3: >> * Jason: drivers should not hold any locks while calling ib_unregister_device() >> * Jason: use https://github.com/jgunthorpe/linux/commits/device_locking_cleanup >> >> v1->v2: >> * Have notifier blocks in usnic_ib_dev instead of using workqueue to defer >> event handling. >> drivers/infiniband/hw/usnic/usnic_ib_main.c | 30 ++++++++++++++--------------- >> 1 file changed, 14 insertions(+), 16 deletions(-) > > I added the hunks to use setnedev from my as-yet-unsent patch and > applies this to for-next. I assume you tested on top of that. You > should probably test the final result. > > Thanks, > Jason
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c index 9529a08..d0ddbec 100644 --- a/drivers/infiniband/hw/usnic/usnic_ib_main.c +++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c @@ -216,18 +216,17 @@ static int usnic_ib_netdevice_event(struct notifier_block *notifier, unsigned long event, void *ptr) { struct usnic_ib_dev *us_ibdev; + struct ib_device *ibdev; struct net_device *netdev = netdev_notifier_info_to_dev(ptr); - mutex_lock(&usnic_ib_ibdev_list_lock); - list_for_each_entry(us_ibdev, &usnic_ib_ibdev_list, ib_dev_link) { - if (us_ibdev->netdev == netdev) { - usnic_ib_handle_usdev_event(us_ibdev, event); - break; - } - } - mutex_unlock(&usnic_ib_ibdev_list_lock); + ibdev = ib_device_get_by_netdev(netdev, RDMA_DRIVER_USNIC); + if (!ibdev) + return NOTIFY_DONE; + us_ibdev = container_of(ibdev, struct usnic_ib_dev, ib_dev); + usnic_ib_handle_usdev_event(us_ibdev, event); + ib_device_put(ibdev); return NOTIFY_DONE; } @@ -282,16 +281,15 @@ static int usnic_ib_inetaddr_event(struct notifier_block *notifier, struct usnic_ib_dev *us_ibdev; struct in_ifaddr *ifa = ptr; struct net_device *netdev = ifa->ifa_dev->dev; + struct ib_device *ibdev; - mutex_lock(&usnic_ib_ibdev_list_lock); - list_for_each_entry(us_ibdev, &usnic_ib_ibdev_list, ib_dev_link) { - if (us_ibdev->netdev == netdev) { - usnic_ib_handle_inet_event(us_ibdev, event, ptr); - break; - } - } - mutex_unlock(&usnic_ib_ibdev_list_lock); + ibdev = ib_device_get_by_netdev(netdev, RDMA_DRIVER_USNIC); + if (!ibdev) + return NOTIFY_DONE; + us_ibdev = container_of(ibdev, struct usnic_ib_dev, ib_dev); + usnic_ib_handle_inet_event(us_ibdev, event, ptr); + ib_device_put(ibdev); return NOTIFY_DONE; } static struct notifier_block usnic_ib_inetaddr_notifier = {