[v2,net-next,00/23] net: add preliminary netdev refcount tracking

Message ID 20211203024640.1180745-1-eric.dumazet@gmail.com (mailing list archive)

Message

Eric Dumazet Dec. 3, 2021, 2:46 a.m. UTC
From: Eric Dumazet <edumazet@google.com>

The first two patches add generic infrastructure that will be used
to track refcount increments/decrements.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.
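
As a rough user-space illustration of the cookie idea (not the code from
these patches), the sketch below keeps one record per outstanding
reference and uses the record's address as the cookie. All toy_* names
are hypothetical, and the file/line pair is only a stand-in for the
saved stack trace.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One record per outstanding reference; its address is the "cookie". */
struct toy_tracker {
	struct toy_tracker *next;
	const char *file;	/* stand-in for a saved stack trace */
	int line;
};

/* Per-object directory of outstanding references. */
struct toy_tracker_dir {
	struct toy_tracker *head;
	int refcnt;
};

/* Take a reference and hand back a cookie identifying this increment. */
static struct toy_tracker *toy_hold(struct toy_tracker_dir *dir,
				    const char *file, int line)
{
	struct toy_tracker *t = malloc(sizeof(*t));

	assert(t != NULL);
	t->file = file;
	t->line = line;
	t->next = dir->head;
	dir->head = t;
	dir->refcnt++;
	return t;
}

/*
 * Drop the reference identified by *cookie and clear the cookie.
 * Returns 0 on success, -1 if the cookie matches no outstanding
 * increment (for example, a second put with the same cookie).
 */
static int toy_put(struct toy_tracker_dir *dir, struct toy_tracker **cookie)
{
	struct toy_tracker **pp;

	for (pp = &dir->head; *pp; pp = &(*pp)->next) {
		if (*pp == *cookie) {
			*pp = (*cookie)->next;
			free(*cookie);
			*cookie = NULL;
			dir->refcnt--;
			return 0;
		}
	}
	return -1;
}

/* Print the origin of every reference still held; returns how many. */
static int toy_report_leaks(const struct toy_tracker_dir *dir)
{
	const struct toy_tracker *t;
	int n = 0;

	for (t = dir->head; t; t = t->next) {
		printf("leaked reference taken at %s:%d\n", t->file, t->line);
		n++;
	}
	return n;
}

/* Record the call site of each increment. */
#define TOY_HOLD(dir) toy_hold((dir), __FILE__, __LINE__)
```

Because each put names the exact increment it releases, an unmatched put
is rejected on the spot, and whatever is left in the directory at
teardown points at the holders that never released.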

The third patch adds dev_hold_track() and dev_put_track() helpers
(CONFIG_NET_DEV_REFCNT_TRACKER).

Then a series of 20 patches converts some dev_hold()/dev_put()
pairs to the new helpers: dev_hold_track() and dev_put_track().
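
The shape of such a conversion can be sketched in user space as follows.
This is a mock under stated assumptions, not the kernel helpers
themselves: struct toy_netdev, toy_tracker_t, struct toy_queue and all
toy_* functions are hypothetical. The point is only that an object
pinning a device stores a tracker next to the device pointer and passes
it to both the hold and the put.

```c
#include <assert.h>
#include <stdlib.h>

/* Mock device: a bare counter plus a count of tracked references. */
struct toy_netdev {
	int refcnt;
	int tracked;
};

/* Hypothetical cookie type: NULL means "no reference held". */
typedef struct toy_ref { const char *owner; } *toy_tracker_t;

/* Untracked pair: only the counter changes. */
static void toy_dev_hold(struct toy_netdev *dev) { dev->refcnt++; }
static void toy_dev_put(struct toy_netdev *dev) { dev->refcnt--; }

/* Tracked hold: same counter update, plus a cookie kept by the caller. */
static void toy_dev_hold_track(struct toy_netdev *dev,
			       toy_tracker_t *tracker, const char *owner)
{
	*tracker = malloc(sizeof(**tracker));
	assert(*tracker != NULL);
	(*tracker)->owner = owner;
	dev->tracked++;
	toy_dev_hold(dev);
}

/* Tracked put: releases exactly the hold that filled in *tracker. */
static void toy_dev_put_track(struct toy_netdev *dev, toy_tracker_t *tracker)
{
	assert(*tracker != NULL);	/* a put without a matching hold trips here */
	free(*tracker);
	*tracker = NULL;
	dev->tracked--;
	toy_dev_put(dev);
}

/* An object that pins a device keeps the tracker next to the pointer. */
struct toy_queue {
	struct toy_netdev *dev;
	toy_tracker_t dev_tracker;
};

static void toy_queue_init(struct toy_queue *q, struct toy_netdev *dev)
{
	q->dev = dev;
	toy_dev_hold_track(dev, &q->dev_tracker, "toy_queue");
}

static void toy_queue_release(struct toy_queue *q)
{
	toy_dev_put_track(q->dev, &q->dev_tracker);
	q->dev = NULL;
}
```

Keeping the tracker inside the owning structure means the release path
always has the matching cookie at hand.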

Hopefully this will be used by developers and syzbot to
root-cause bugs that cause netdevice dismantle freezes.

With the CONFIG_PCPU_DEV_REFCNT=n option, we were able to detect
some classes of bugs, but too late (only once too many dev_put()
calls had already happened).
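
To illustrate why a bare counter notices the problem too late, here is a
small hypothetical model (toy_counter and toy_replay_puts are made-up
names, not kernel code). It replays a sequence of puts and reports the
caller at which the counter first goes negative, which need not be the
caller that did the extra put.

```c
#include <assert.h>

struct toy_counter {
	int refcnt;
};

/*
 * Replay n puts, callers[i] being the id of the i-th caller.
 * Returns the id of the caller at which underflow is first noticed,
 * or -1 if the counter never goes negative.
 */
static int toy_replay_puts(struct toy_counter *c, const int *callers, int n)
{
	for (int i = 0; i < n; i++) {
		if (--c->refcnt < 0)
			return callers[i];
	}
	return -1;
}
```

With two legitimate holds and puts from callers 1, 2 and 3, where
caller 2 never took a reference, the underflow only shows up at
caller 3, which did nothing wrong.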

v2: added four additional patches,
    added netdev_tracker_alloc() and netdev_tracker_free()
    addressed build error (kernel bots),
    use GFP_ATOMIC in test_ref_tracker_timer_func()

Eric Dumazet (23):
  lib: add reference counting tracking infrastructure
  lib: add tests for reference tracker
  net: add dev_hold_track() and dev_put_track() helpers
  net: add net device refcount tracker to struct netdev_rx_queue
  net: add net device refcount tracker to struct netdev_queue
  net: add net device refcount tracker to ethtool_phys_id()
  net: add net device refcount tracker to dev_ifsioc()
  drop_monitor: add net device refcount tracker
  net: dst: add net device refcount tracking to dst_entry
  ipv6: add net device refcount tracker to rt6_probe_deferred()
  sit: add net device refcount tracking to ip_tunnel
  ipv6: add net device refcount tracker to struct ip6_tnl
  net: add net device refcount tracker to struct neighbour
  net: add net device refcount tracker to struct pneigh_entry
  net: add net device refcount tracker to struct neigh_parms
  net: add net device refcount tracker to struct netdev_adjacent
  ipv6: add net device refcount tracker to struct inet6_dev
  ipv4: add net device refcount tracker to struct in_device
  net/sched: add net device refcount tracker to struct Qdisc
  net: linkwatch: add net device refcount tracker
  net: failover: add net device refcount tracker
  ipmr, ip6mr: add net device refcount tracker to struct vif_device
  netpoll: add net device refcount tracker to struct netpoll

 drivers/net/netconsole.c    |   2 +-
 include/linux/inetdevice.h  |   2 +
 include/linux/mroute_base.h |   1 +
 include/linux/netdevice.h   |  66 +++++++++++++++++
 include/linux/netpoll.h     |   1 +
 include/linux/ref_tracker.h |  73 +++++++++++++++++++
 include/net/devlink.h       |   3 +
 include/net/dst.h           |   1 +
 include/net/failover.h      |   1 +
 include/net/if_inet6.h      |   1 +
 include/net/ip6_tunnel.h    |   1 +
 include/net/ip_tunnels.h    |   3 +
 include/net/neighbour.h     |   3 +
 include/net/sch_generic.h   |   2 +-
 lib/Kconfig                 |   5 ++
 lib/Kconfig.debug           |  10 +++
 lib/Makefile                |   4 +-
 lib/ref_tracker.c           | 140 ++++++++++++++++++++++++++++++++++++
 lib/test_ref_tracker.c      | 115 +++++++++++++++++++++++++++++
 net/Kconfig                 |   8 +++
 net/core/dev.c              |  10 ++-
 net/core/dev_ioctl.c        |   5 +-
 net/core/drop_monitor.c     |   6 +-
 net/core/dst.c              |   8 +--
 net/core/failover.c         |   4 +-
 net/core/link_watch.c       |   4 +-
 net/core/neighbour.c        |  18 ++---
 net/core/net-sysfs.c        |   8 +--
 net/core/netpoll.c          |   4 +-
 net/ethtool/ioctl.c         |   5 +-
 net/ipv4/devinet.c          |   4 +-
 net/ipv4/ipmr.c             |   3 +-
 net/ipv4/route.c            |   7 +-
 net/ipv6/addrconf.c         |   4 +-
 net/ipv6/addrconf_core.c    |   2 +-
 net/ipv6/ip6_gre.c          |   8 +--
 net/ipv6/ip6_tunnel.c       |   4 +-
 net/ipv6/ip6_vti.c          |   4 +-
 net/ipv6/ip6mr.c            |   3 +-
 net/ipv6/route.c            |  10 +--
 net/ipv6/sit.c              |   4 +-
 net/sched/sch_generic.c     |   4 +-
 42 files changed, 509 insertions(+), 62 deletions(-)
 create mode 100644 include/linux/ref_tracker.h
 create mode 100644 lib/ref_tracker.c
 create mode 100644 lib/test_ref_tracker.c

Comments

Jakub Kicinski Dec. 4, 2021, 12:47 a.m. UTC | #1
On Thu,  2 Dec 2021 18:46:17 -0800 Eric Dumazet wrote:
> The first two patches add generic infrastructure that will be used
> to track refcount increments/decrements.
> 
> The general idea is to be able to precisely pair each decrement with
> a corresponding prior increment. Both share a cookie, basically
> a pointer to private data storing stack traces.
> 
> The third patch adds dev_hold_track() and dev_put_track() helpers
> (CONFIG_NET_DEV_REFCNT_TRACKER).
> 
> Then a series of 20 patches converts some dev_hold()/dev_put()
> pairs to the new helpers: dev_hold_track() and dev_put_track().
> 
> Hopefully this will be used by developers and syzbot to
> root-cause bugs that cause netdevice dismantle freezes.
> 
> With the CONFIG_PCPU_DEV_REFCNT=n option, we were able to detect
> some classes of bugs, but too late (only once too many dev_put()
> calls had already happened).

Hi Eric, there's a handful of kdoc warnings added here:

include/linux/netdevice.h:2278: warning: Function parameter or member 'refcnt_tracker' not described in 'net_device'
include/net/devlink.h:679: warning: Function parameter or member 'dev_tracker' not described in 'devlink_trap_metadata'
include/linux/netdevice.h:2283: warning: Function parameter or member 'refcnt_tracker' not described in 'net_device'
include/linux/mroute_base.h:40: warning: Function parameter or member 'dev_tracker' not described in 'vif_device'

Would you mind following up? Likely not worth re-spinning just for that.

Eric Dumazet Dec. 4, 2021, 1 a.m. UTC | #2
On Fri, Dec 3, 2021 at 4:47 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu,  2 Dec 2021 18:46:17 -0800 Eric Dumazet wrote:
> > The first two patches add generic infrastructure that will be used
> > to track refcount increments/decrements.
> >
> > The general idea is to be able to precisely pair each decrement with
> > a corresponding prior increment. Both share a cookie, basically
> > a pointer to private data storing stack traces.
> >
> > The third patch adds dev_hold_track() and dev_put_track() helpers
> > (CONFIG_NET_DEV_REFCNT_TRACKER).
> >
> > Then a series of 20 patches converts some dev_hold()/dev_put()
> > pairs to the new helpers: dev_hold_track() and dev_put_track().
> >
> > Hopefully this will be used by developers and syzbot to
> > root-cause bugs that cause netdevice dismantle freezes.
> >
> > With the CONFIG_PCPU_DEV_REFCNT=n option, we were able to detect
> > some classes of bugs, but too late (only once too many dev_put()
> > calls had already happened).
>
> Hi Eric, there's a handful of kdoc warnings added here:
>
> include/linux/netdevice.h:2278: warning: Function parameter or member 'refcnt_tracker' not described in 'net_device'
> include/net/devlink.h:679: warning: Function parameter or member 'dev_tracker' not described in 'devlink_trap_metadata'
> include/linux/netdevice.h:2283: warning: Function parameter or member 'refcnt_tracker' not described in 'net_device'
> include/linux/mroute_base.h:40: warning: Function parameter or member 'dev_tracker' not described in 'vif_device'
>
> Would you mind following up? Likely not worth re-spinning just for that.

Sure thing, I will insert a patch to fix this in the next round.

Thanks!