
net/core: Export dev_core_stats_rx_dropped_inc sets

Message ID 20230911082016.3694700-1-yajun.deng@linux.dev (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 5496 this patch: 5496
netdev/cc_maintainers warning 1 maintainers not CCed: daniel@iogearbox.net
netdev/build_clang success Errors and warnings before: 1672 this patch: 1672
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 5876 this patch: 5876
netdev/checkpatch warning CHECK: Macro argument 'FIELD' may be better as '(FIELD)' to avoid precedence issues WARNING: line length of 91 exceeds 80 columns WARNING: line length of 93 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline fail Was 0 now: 1

Commit Message

Yajun Deng Sept. 11, 2023, 8:20 a.m. UTC
Although there is a kfree_skb_reason() helper function that can be used
to find the reason for dropped packets, most of its callers do not
increment any of rx_dropped, tx_dropped, rx_nohandler or
rx_otherhost_dropped.

Users are more concerned about why the dropped counter in ifconfig is
increasing. So export the dev_core_stats_rx_dropped_inc set of functions,
so that users can trace them to find out why rx_dropped is increasing.

Export dev_core_stats_{rx_dropped, tx_dropped, rx_nohandler,
rx_otherhost_dropped}_inc for tracing. Also, move dev_core_stats()
and netdev_core_stats_alloc() into dev.c, because they are not called
externally.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
---
 include/linux/netdevice.h | 32 +++++---------------------------
 net/core/dev.c            | 30 ++++++++++++++++++++++++++++--
 2 files changed, 33 insertions(+), 29 deletions(-)

Comments

Stephen Hemminger Sept. 11, 2023, 4:15 p.m. UTC | #1
On Mon, 11 Sep 2023 16:20:16 +0800
Yajun Deng <yajun.deng@linux.dev> wrote:

> Although there is a kfree_skb_reason() helper function that can be
> used to find the reason for dropped packets, but most callers didn't
> increase one of rx_dropped, tx_dropped, rx_nohandler and
> rx_otherhost_dropped.
> 
> For the users, people are more concerned about why the dropped in
> ifconfig is increasing. So we can export
> dev_core_stats_rx_dropped_inc sets, which users would trace them know
> why rx_dropped is increasing.

ifconfig has been frozen for over 10 years, and is deprecated so there
is no point in catering to legacy APIs. There are better APIs such as
ethtool and netlink that can provide more info.
Yajun Deng Sept. 12, 2023, 1:42 a.m. UTC | #2
September 12, 2023 at 12:15 AM, "Stephen Hemminger" <stephen@networkplumber.org> wrote:


> 
> On Mon, 11 Sep 2023 16:20:16 +0800
> Yajun Deng <yajun.deng@linux.dev> wrote:
> 
> > 
> > Although there is a kfree_skb_reason() helper function that can be
> >  used to find the reason for dropped packets, but most callers didn't
> >  increase one of rx_dropped, tx_dropped, rx_nohandler and
> >  rx_otherhost_dropped.
> >  
> >  For the users, people are more concerned about why the dropped in
> >  ifconfig is increasing. So we can export
> >  dev_core_stats_rx_dropped_inc sets, which users would trace them know
> >  why rx_dropped is increasing.
> > 
> 
> ifconfig has been frozen for over 10 years, and is deprecated so there
> is no point in catering to legacy api's. There are better API's such as
> ethtool and netlink that can provide more info.
>
Yes, ifconfig is deprecated, but the dropped counter shown by ifconfig and ip is the same.
We're more concerned about the reason for dropped packets.
ip, ethtool and netlink can't show the reason.
Eric Dumazet Sept. 12, 2023, 4:23 a.m. UTC | #3
On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote:
>
> Although there is a kfree_skb_reason() helper function that can be used
> to find the reason for dropped packets, but most callers didn't increase
> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
>
> For the users, people are more concerned about why the dropped in ifconfig
> is increasing. So we can export dev_core_stats_rx_dropped_inc sets,
> which users would trace them know why rx_dropped is increasing.
>
> Export dev_core_stats_{rx_dropped, tx_dropped, rx_nohandler,
> rx_otherhost_dropped}_inc for trace. Also, move dev_core_stats()
> and netdev_core_stats_alloc() in dev.c, because they are not called
> externally.
>
> Signed-off-by: Yajun Deng <yajun.deng@linux.dev>

Okay, but it seems you forgot to say which tree was targeted by this patch.

Documentation/process/maintainer-netdev.rst

I would guess net-next, but patch authors are supposed to be explicit.

> ---
>  include/linux/netdevice.h | 32 +++++---------------------------
>  net/core/dev.c            | 30 ++++++++++++++++++++++++++++--
>  2 files changed, 33 insertions(+), 29 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 0896aaa91dd7..879b01c85ba4 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3954,6 +3954,11 @@ int dev_forward_skb_nomtu(struct net_device *dev, struct sk_buff *skb);
>  bool is_skb_forwardable(const struct net_device *dev,
>                         const struct sk_buff *skb);
>
> +void dev_core_stats_rx_dropped_inc(struct net_device *dev);
> +void dev_core_stats_tx_dropped_inc(struct net_device *dev);
> +void dev_core_stats_rx_nohandler_inc(struct net_device *dev);
> +void dev_core_stats_rx_otherhost_dropped_inc(struct net_device *dev);
> +
>  static __always_inline bool __is_skb_forwardable(const struct net_device *dev,
>                                                  const struct sk_buff *skb,
>                                                  const bool check_mtu)
> @@ -3980,33 +3985,6 @@ static __always_inline bool __is_skb_forwardable(const struct net_device *dev,
>         return false;
>  }
>
> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev);
> -
> -static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
> -{
> -       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> -       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> -
> -       if (likely(p))
> -               return p;
> -
> -       return netdev_core_stats_alloc(dev);
> -}
> -
> -#define DEV_CORE_STATS_INC(FIELD)                                              \
> -static inline void dev_core_stats_##FIELD##_inc(struct net_device *dev)                \
> -{                                                                              \
> -       struct net_device_core_stats __percpu *p;                               \
> -                                                                               \
> -       p = dev_core_stats(dev);                                                \
> -       if (p)                                                                  \
> -               this_cpu_inc(p->FIELD);                                         \
> -}
> -DEV_CORE_STATS_INC(rx_dropped)
> -DEV_CORE_STATS_INC(tx_dropped)
> -DEV_CORE_STATS_INC(rx_nohandler)
> -DEV_CORE_STATS_INC(rx_otherhost_dropped)
> -
>  static __always_inline int ____dev_forward_skb(struct net_device *dev,
>                                                struct sk_buff *skb,
>                                                const bool check_mtu)
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ccff2b6ef958..32ba730405b4 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10475,7 +10475,7 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
>  }
>  EXPORT_SYMBOL(netdev_stats_to_stats64);
>
> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>  {
>         struct net_device_core_stats __percpu *p;
>
> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
>         /* This READ_ONCE() pairs with the cmpxchg() above */
>         return READ_ONCE(dev->core_stats);
>  }
> -EXPORT_SYMBOL(netdev_core_stats_alloc);
> +
> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)

Please remove this inline attribute. Consider using __cold instead.

> +{
> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> +
> +       if (likely(p))
> +               return p;
> +
> +       return netdev_core_stats_alloc(dev);
> +}
> +
> +#define DEV_CORE_STATS_INC(FIELD)                              \
> +void dev_core_stats_##FIELD##_inc(struct net_device *dev)      \
> +{                                                              \
> +       struct net_device_core_stats __percpu *p;               \
> +                                                               \
> +       p = dev_core_stats(dev);                                \
> +       if (p)                                                  \
> +               this_cpu_inc(p->FIELD);                         \
> +}                                                              \
> +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)
> +
> +DEV_CORE_STATS_INC(rx_dropped);
> +DEV_CORE_STATS_INC(tx_dropped);
> +DEV_CORE_STATS_INC(rx_nohandler);
> +DEV_CORE_STATS_INC(rx_otherhost_dropped);

#undef DEV_CORE_STATS_INC

>
>  /**
>   *     dev_get_stats   - get network device statistics
> --
> 2.25.1
>
Alexander Lobakin Sept. 12, 2023, 3:57 p.m. UTC | #4
From: Eric Dumazet <edumazet@google.com>
Date: Tue, 12 Sep 2023 06:23:24 +0200

> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote:
>>
>> Although there is a kfree_skb_reason() helper function that can be used
>> to find the reason for dropped packets, but most callers didn't increase
>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.

[...]

>>  EXPORT_SYMBOL(netdev_stats_to_stats64);
>>
>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>>  {
>>         struct net_device_core_stats __percpu *p;
>>
>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
>>         /* This READ_ONCE() pairs with the cmpxchg() above */
>>         return READ_ONCE(dev->core_stats);
>>  }
>> -EXPORT_SYMBOL(netdev_core_stats_alloc);
>> +
>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
> 
> Please remove this inline attribute. Consider using __cold instead.

__cold? O_o I thought the author's inlining it as it's a couple
locs/instructions, while the compilers would most likely keep it
non-inlined as it's referenced 4 times. __cold will for sure keep it
standalone and place it in .text.cold, i.e. far away from the call sites.
I realize dev_core_stats_*() aren't called frequently, but why making
only one small helper cold rather than all of them then?

> 
>> +{
>> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
>> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
>> +
>> +       if (likely(p))
>> +               return p;
>> +
>> +       return netdev_core_stats_alloc(dev);
>> +}

[...]

Thanks,
Olek
Eric Dumazet Sept. 12, 2023, 4:04 p.m. UTC | #5
On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin
<aleksander.lobakin@intel.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
> Date: Tue, 12 Sep 2023 06:23:24 +0200
>
> > On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote:
> >>
> >> Although there is a kfree_skb_reason() helper function that can be used
> >> to find the reason for dropped packets, but most callers didn't increase
> >> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
>
> [...]
>
> >>  EXPORT_SYMBOL(netdev_stats_to_stats64);
> >>
> >> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> >> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> >>  {
> >>         struct net_device_core_stats __percpu *p;
> >>
> >> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
> >>         /* This READ_ONCE() pairs with the cmpxchg() above */
> >>         return READ_ONCE(dev->core_stats);
> >>  }
> >> -EXPORT_SYMBOL(netdev_core_stats_alloc);
> >> +
> >> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
> >
> > Please remove this inline attribute. Consider using __cold instead.
>
> __cold? O_o I thought the author's inlining it as it's a couple
> locs/instructions, while the compilers would most likely keep it
> non-inlined as it's referenced 4 times. __cold will for sure keep it
> standalone and place it in .text.cold, i.e. far away from the call sites.
> I realize dev_core_stats_*() aren't called frequently, but why making
> only one small helper cold rather than all of them then?
>

This helper is used at least one time per netdevice lifetime.
This is definitely cold.
Forcing an inline makes no sense, this would duplicate the code four times,
for absolutely no gain.

> >
> >> +{
> >> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> >> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> >> +
> >> +       if (likely(p))
> >> +               return p;
> >> +
> >> +       return netdev_core_stats_alloc(dev);
> >> +}
>
> [...]
>
> Thanks,
> Olek
Alexander Lobakin Sept. 12, 2023, 4:22 p.m. UTC | #6
From: Yajun Deng <yajun.deng@linux.dev>
Date: Mon, 11 Sep 2023 16:20:16 +0800

> Although there is a kfree_skb_reason() helper function that can be used
> to find the reason for dropped packets, but most callers didn't increase
> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.

[...]

> diff --git a/net/core/dev.c b/net/core/dev.c
> index ccff2b6ef958..32ba730405b4 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10475,7 +10475,7 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
>  }
>  EXPORT_SYMBOL(netdev_stats_to_stats64);
>  
> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>  {
>  	struct net_device_core_stats __percpu *p;
>  
> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
>  	/* This READ_ONCE() pairs with the cmpxchg() above */
>  	return READ_ONCE(dev->core_stats);
>  }
> -EXPORT_SYMBOL(netdev_core_stats_alloc);
> +
> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
> +{
> +	/* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> +	struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> +
> +	if (likely(p))
> +		return p;
> +
> +	return netdev_core_stats_alloc(dev);
> +}
> +
> +#define DEV_CORE_STATS_INC(FIELD)				\
> +void dev_core_stats_##FIELD##_inc(struct net_device *dev)	\
> +{								\
> +	struct net_device_core_stats __percpu *p;		\
> +								\
> +	p = dev_core_stats(dev);				\
> +	if (p)							\
> +		this_cpu_inc(p->FIELD);				\
> +}								\
> +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)
> +
> +DEV_CORE_STATS_INC(rx_dropped);
> +DEV_CORE_STATS_INC(tx_dropped);
> +DEV_CORE_STATS_INC(rx_nohandler);
> +DEV_CORE_STATS_INC(rx_otherhost_dropped);

I realize you need to have an external function to be able to trace it,
but why don't you make it just 1 function instead of 4+ (will only be
increasing)?

Define 1 function

void dev_core_stats_inc(struct net_device *dev, u32 offset)
{
	struct net_device_core_stats __percpu *p;

	p = dev_core_stats(dev);
	if (p)
		this_cpu_inc(*(unsigned long *)((void *)p + offset));
}
EXPORT_SYMBOL_GPL(dev_core_stats_inc); // Why not GPL BTW?

And then build inlines:

#define DEV_CORE_STATS_INC(FIELD)				\
static inline void						\
dev_core_stats_##FIELD##_inc(struct net_device *dev)		\
{								\
	dev_core_stats_inc(dev,					\
		offsetof(struct net_device_core_stats, FIELD));	\
}

DEV_CORE_STATS_INC(rx_dropped);
...

OR even just make them macros

#define __DEV_CORE_STATS_INC(dev, field)			\
	dev_core_stats_inc(dev,					\
		offsetof(struct net_device_core_stats, field))

#define dev_core_stats_rx_dropped_inc(dev)			\
	__DEV_CORE_STATS_INC(dev, rx_dropped)
...

Just don't copy that awful Thunderbird's line wrap and don't assume this
code builds and works and that is something finished/polished.

You'll be able to trace functions and you'll be able to understand which
counter has been incremented by checking the second argument, i.e. the
field offset (IIRC tracing shows you arguments).
And that way you wouldn't geometrically increase the number of symbol
exports and deal with its consequences.

>  
>  /**
>   *	dev_get_stats	- get network device statistics

Thanks,
Olek
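
To illustrate the single-function suggestion in comment #6, here is a
minimal userspace sketch. It assumes plain (non-per-CPU) counters, and
struct core_stats, struct fake_dev and core_stats_inc() are hypothetical
stand-ins for struct net_device_core_stats, struct net_device and the
proposed dev_core_stats_inc(); it is not kernel code and not the patch
that was eventually merged.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for struct net_device_core_stats. */
struct core_stats {
	unsigned long rx_dropped;
	unsigned long tx_dropped;
	unsigned long rx_nohandler;
	unsigned long rx_otherhost_dropped;
};

/* Hypothetical stand-in for struct net_device. */
struct fake_dev {
	struct core_stats stats;
};

/*
 * The only out-of-line function. A tracer attached here sees 'offset'
 * as the second argument and can tell which counter was incremented.
 */
void core_stats_inc(struct fake_dev *dev, size_t offset)
{
	/* Compute the field address first, then dereference it. */
	unsigned long *counter =
		(unsigned long *)((char *)&dev->stats + offset);

	(*counter)++;
}

/* Thin inline wrappers keep the per-counter call sites unchanged. */
#define CORE_STATS_INC(FIELD)						\
static inline void core_stats_##FIELD##_inc(struct fake_dev *dev)	\
{									\
	core_stats_inc(dev, offsetof(struct core_stats, FIELD));	\
}

CORE_STATS_INC(rx_dropped)
CORE_STATS_INC(tx_dropped)
CORE_STATS_INC(rx_nohandler)
CORE_STATS_INC(rx_otherhost_dropped)
```

Calling core_stats_rx_dropped_inc(&dev) twice leaves dev.stats.rx_dropped
at 2 while the other counters stay untouched. Only core_stats_inc() would
need to be exported, however many counters are added later.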
Alexander Lobakin Sept. 12, 2023, 5:15 p.m. UTC | #7
From: Eric Dumazet <edumazet@google.com>
Date: Tue, 12 Sep 2023 18:04:44 +0200

> On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin
> <aleksander.lobakin@intel.com> wrote:
>>
>> From: Eric Dumazet <edumazet@google.com>
>> Date: Tue, 12 Sep 2023 06:23:24 +0200
>>
>>> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote:
>>>>
>>>> Although there is a kfree_skb_reason() helper function that can be used
>>>> to find the reason for dropped packets, but most callers didn't increase
>>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
>>
>> [...]
>>
>>>>  EXPORT_SYMBOL(netdev_stats_to_stats64);
>>>>
>>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>>>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>>>>  {
>>>>         struct net_device_core_stats __percpu *p;
>>>>
>>>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
>>>>         /* This READ_ONCE() pairs with the cmpxchg() above */
>>>>         return READ_ONCE(dev->core_stats);
>>>>  }
>>>> -EXPORT_SYMBOL(netdev_core_stats_alloc);
>>>> +
>>>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
>>>
>>> Please remove this inline attribute. Consider using __cold instead.
>>
>> __cold? O_o I thought the author's inlining it as it's a couple
>> locs/instructions, while the compilers would most likely keep it
>> non-inlined as it's referenced 4 times. __cold will for sure keep it
>> standalone and place it in .text.cold, i.e. far away from the call sites.
>> I realize dev_core_stats_*() aren't called frequently, but why making
>> only one small helper cold rather than all of them then?
>>
> 
> This helper is used at least one time per netdevice lifetime.
> This is definitely cold.

But then each dev_stats_*_inc() (not cold) has to call it from a
completely different piece of .text far from them. I either don't
understand the idea or dunno. Why not make them cold as well then?

> Forcing an inline makes no sense, this would duplicate the code four times,
> for absolutely no gain.

I'd love to see bloat-o-meter numbers, I suspect we're talking about
20-30 bytes.

> 
>>>
>>>> +{
>>>> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
>>>> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
>>>> +
>>>> +       if (likely(p))
>>>> +               return p;
>>>> +
>>>> +       return netdev_core_stats_alloc(dev);
>>>> +}
>>
>> [...]
>>
>> Thanks,
>> Olek

Thanks,
Olek
Eric Dumazet Sept. 12, 2023, 5:28 p.m. UTC | #8
On Tue, Sep 12, 2023 at 7:16 PM Alexander Lobakin
<aleksander.lobakin@intel.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
> Date: Tue, 12 Sep 2023 18:04:44 +0200
>
> > On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin
> > <aleksander.lobakin@intel.com> wrote:
> >>
> >> From: Eric Dumazet <edumazet@google.com>
> >> Date: Tue, 12 Sep 2023 06:23:24 +0200
> >>
> >>> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote:
> >>>>
> >>>> Although there is a kfree_skb_reason() helper function that can be used
> >>>> to find the reason for dropped packets, but most callers didn't increase
> >>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
> >>
> >> [...]
> >>
> >>>>  EXPORT_SYMBOL(netdev_stats_to_stats64);
> >>>>
> >>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> >>>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> >>>>  {
> >>>>         struct net_device_core_stats __percpu *p;
> >>>>
> >>>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
> >>>>         /* This READ_ONCE() pairs with the cmpxchg() above */
> >>>>         return READ_ONCE(dev->core_stats);
> >>>>  }
> >>>> -EXPORT_SYMBOL(netdev_core_stats_alloc);
> >>>> +
> >>>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
> >>>
> >>> Please remove this inline attribute. Consider using __cold instead.
> >>
> >> __cold? O_o I thought the author's inlining it as it's a couple
> >> locs/instructions, while the compilers would most likely keep it
> >> non-inlined as it's referenced 4 times. __cold will for sure keep it
> >> standalone and place it in .text.cold, i.e. far away from the call sites.
> >> I realize dev_core_stats_*() aren't called frequently, but why making
> >> only one small helper cold rather than all of them then?
> >>
> >
> > This helper is used at least one time per netdevice lifetime.
> > This is definitely cold.
>
> But then each dev_stats_*_inc() (not cold) has to call it from a
> completely different piece of .text far from them. I either don't
> understand the idea or dunno. Why not make them cold as well then?
>

The __cold attribute is only applied to the helper _allocating_ the
memory, once.

Not on the functions actually incrementing the stats.

There are situations where they can be called thousands/millions of
times per second (incast flood).
If this situation happens, the _allocation_ still happens once.



> > Forcing an inline makes no sense, this would duplicate the code four times,
> > for absolutely no gain.
>
> I'd love to see bloat-o-meter numbers, I suspect we're talking about
> 20-30 bytes.
>
> >
> >>>
> >>>> +{
> >>>> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> >>>> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> >>>> +
> >>>> +       if (likely(p))
> >>>> +               return p;
> >>>> +
> >>>> +       return netdev_core_stats_alloc(dev);
> >>>> +}
> >>
> >> [...]
> >>
> >> Thanks,
> >> Olek
>
> Thanks,
> Olek
Alexander Lobakin Sept. 12, 2023, 5:43 p.m. UTC | #9
From: Eric Dumazet <edumazet@google.com>
Date: Tue, 12 Sep 2023 19:28:50 +0200

> On Tue, Sep 12, 2023 at 7:16 PM Alexander Lobakin
> <aleksander.lobakin@intel.com> wrote:
>>
>> From: Eric Dumazet <edumazet@google.com>
>> Date: Tue, 12 Sep 2023 18:04:44 +0200
>>
>>> On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin
>>> <aleksander.lobakin@intel.com> wrote:
>>>>
>>>> From: Eric Dumazet <edumazet@google.com>
>>>> Date: Tue, 12 Sep 2023 06:23:24 +0200
>>>>
>>>>> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote:
>>>>>>
>>>>>> Although there is a kfree_skb_reason() helper function that can be used
>>>>>> to find the reason for dropped packets, but most callers didn't increase
>>>>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
>>>>
>>>> [...]
>>>>
>>>>>>  EXPORT_SYMBOL(netdev_stats_to_stats64);
>>>>>>
>>>>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>>>>>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>>>>>>  {
>>>>>>         struct net_device_core_stats __percpu *p;
>>>>>>
>>>>>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
>>>>>>         /* This READ_ONCE() pairs with the cmpxchg() above */
>>>>>>         return READ_ONCE(dev->core_stats);
>>>>>>  }
>>>>>> -EXPORT_SYMBOL(netdev_core_stats_alloc);
>>>>>> +
>>>>>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
>>>>>
>>>>> Please remove this inline attribute. Consider using __cold instead.
>>>>
>>>> __cold? O_o I thought the author's inlining it as it's a couple
>>>> locs/instructions, while the compilers would most likely keep it
>>>> non-inlined as it's referenced 4 times. __cold will for sure keep it
>>>> standalone and place it in .text.cold, i.e. far away from the call sites.
>>>> I realize dev_core_stats_*() aren't called frequently, but why making
>>>> only one small helper cold rather than all of them then?
>>>>
>>>
>>> This helper is used at least one time per netdevice lifetime.
>>> This is definitely cold.
>>
>> But then each dev_stats_*_inc() (not cold) has to call it from a
>> completely different piece of .text far from them. I either don't
>> understand the idea or dunno. Why not make them cold as well then?
>>
> 
> The __cold attribute is only applied to the helper _allocating_ the
> memory, once.

Then it should be applied to netdev_core_stats_alloc(), not
dev_core_stats(). The latter only dereferences the already existing
pointer or calls the former, which actually does the allocation.
That's why I don't get why make one if/else non-inline or even cold.

> 
> Not on the functions actually incrementing the stats.
> 
> There are situations where they can be called thousands/millions of
> times per second (incast flood).
> If this situation happens, the _allocation_ still happens once.

Correct, but dev_core_stats() will be called the same millions of times
per second, see above. It's called unconditionally each increment.

So seems like I got the idea of .cold correctly, but you were referring
to the wrong function.

> 
> 
> 
>>> Forcing an inline makes no sense, this would duplicate the code four times,
>>> for absolutely no gain.
>>
>> I'd love to see bloat-o-meter numbers, I suspect we're talking about
>> 20-30 bytes.
>>
>>>
>>>>>
>>>>>> +{
>>>>>> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
>>>>>> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
>>>>>> +
>>>>>> +       if (likely(p))
>>>>>> +               return p;
>>>>>> +
>>>>>> +       return netdev_core_stats_alloc(dev);
>>>>>> +}
>>>>
>>>> [...]
>>>>
>>>> Thanks,
>>>> Olek
>>
>> Thanks,
>> Olek

Thanks,
Olek
Eric Dumazet Sept. 12, 2023, 6:03 p.m. UTC | #10
On Tue, Sep 12, 2023 at 7:44 PM Alexander Lobakin
<aleksander.lobakin@intel.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
> Date: Tue, 12 Sep 2023 19:28:50 +0200
>
> > On Tue, Sep 12, 2023 at 7:16 PM Alexander Lobakin
> > <aleksander.lobakin@intel.com> wrote:
> >>
> >> From: Eric Dumazet <edumazet@google.com>
> >> Date: Tue, 12 Sep 2023 18:04:44 +0200
> >>
> >>> On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin
> >>> <aleksander.lobakin@intel.com> wrote:
> >>>>
> >>>> From: Eric Dumazet <edumazet@google.com>
> >>>> Date: Tue, 12 Sep 2023 06:23:24 +0200
> >>>>
> >>>>> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote:
> >>>>>>
> >>>>>> Although there is a kfree_skb_reason() helper function that can be used
> >>>>>> to find the reason for dropped packets, but most callers didn't increase
> >>>>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
> >>>>
> >>>> [...]
> >>>>
> >>>>>>  EXPORT_SYMBOL(netdev_stats_to_stats64);
> >>>>>>
> >>>>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> >>>>>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> >>>>>>  {
> >>>>>>         struct net_device_core_stats __percpu *p;
> >>>>>>
> >>>>>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
> >>>>>>         /* This READ_ONCE() pairs with the cmpxchg() above */
> >>>>>>         return READ_ONCE(dev->core_stats);
> >>>>>>  }
> >>>>>> -EXPORT_SYMBOL(netdev_core_stats_alloc);
> >>>>>> +
> >>>>>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
> >>>>>
> >>>>> Please remove this inline attribute. Consider using __cold instead.
> >>>>
> >>>> __cold? O_o I thought the author's inlining it as it's a couple
> >>>> locs/instructions, while the compilers would most likely keep it
> >>>> non-inlined as it's referenced 4 times. __cold will for sure keep it
> >>>> standalone and place it in .text.cold, i.e. far away from the call sites.
> >>>> I realize dev_core_stats_*() aren't called frequently, but why making
> >>>> only one small helper cold rather than all of them then?
> >>>>
> >>>
> >>> This helper is used at least one time per netdevice lifetime.
> >>> This is definitely cold.
> >>
> >> But then each dev_stats_*_inc() (not cold) has to call it from a
> >> completely different piece of .text far from them. I either don't
> >> understand the idea or dunno. Why not make them cold as well then?
> >>
> >
> > The __cold attribute is only applied to the helper _allocating_ the
> > memory, once.
>
> Then it should be applied to netdev_core_stats_alloc(), not
> dev_core_stats(). The latter only dereferences the already existing
> pointer or calls the former, which actually does the allocation.
> That's why I don't get why make one if/else non-inline or even cold.

Sure, this was what was suggested (perhaps not _very_ precisely, but
the general idea was pretty clear).
v2 seems ok, right ?

It seems we are all on the same page.

+static __cold struct net_device_core_stats __percpu
*dev_core_stats(struct net_device *dev)
+{
+       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
+       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
+
+       if (likely(p))
+               return p;
+
+       return netdev_core_stats_alloc(dev);
+}
+
+#define DEV_CORE_STATS_INC(FIELD)                              \
+void dev_core_stats_##FIELD##_inc(struct net_device *dev)      \
+{                                                              \
+       struct net_device_core_stats __percpu *p;               \
+                                                               \
+       p = dev_core_stats(dev);                                \
+       if (p)                                                  \
+               this_cpu_inc(p->FIELD);                         \
+}                                                              \
+EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)
Eric Dumazet Sept. 12, 2023, 6:05 p.m. UTC | #11
On Tue, Sep 12, 2023 at 8:03 PM Eric Dumazet <edumazet@google.com> wrote:

> Sure, this was what was suggested (perhaps not _very_ precisely, but
> the general idea was pretty clear).
> v2 seems ok, right ?
>
> It seems we are all on the same page.
>
> +static __cold struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
> +{
> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> +
> +       if (likely(p))
> +               return p;
> +
> +       return netdev_core_stats_alloc(dev);
> +}
> +
> +#define DEV_CORE_STATS_INC(FIELD)                              \
> +void dev_core_stats_##FIELD##_inc(struct net_device *dev)      \
> +{                                                              \
> +       struct net_device_core_stats __percpu *p;               \
> +                                                               \
> +       p = dev_core_stats(dev);                                \
> +       if (p)                                                  \
> +               this_cpu_inc(p->FIELD);                         \
> +}                                                              \
> +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)

Oh well, I just read the patch, and it seems wrong indeed.

netdev_core_stats_alloc() is the one that can be cold.
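
The split described above can be sketched in userspace C. This is only an illustration under stated assumptions: struct fake_dev, struct core_stats, stats_alloc() and stats_get() are hypothetical stand-ins for the kernel structures and helpers, there is no per-CPU data or cmpxchg(), and the GCC/Clang cold attribute is spelled out directly where the kernel would use __cold.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's likely() (assumes GCC/Clang). */
#define likely(x)	__builtin_expect(!!(x), 1)

/* Hypothetical, reduced versions of the kernel structures. */
struct core_stats { unsigned long rx_dropped; };
struct fake_dev { struct core_stats *core_stats; };

/*
 * Slow path: runs at most once per device, so it is the one marked cold
 * (the kernel would write __cold here).
 */
static __attribute__((cold)) struct core_stats *stats_alloc(struct fake_dev *dev)
{
	if (!dev->core_stats)
		dev->core_stats = calloc(1, sizeof(*dev->core_stats));
	return dev->core_stats;
}

/*
 * Hot path: deliberately NOT cold. It only tests the existing pointer
 * and falls back to the allocator on the first call.
 */
static inline struct core_stats *stats_get(struct fake_dev *dev)
{
	struct core_stats *p;

	assert(dev != NULL);
	p = dev->core_stats;
	if (likely(p))
		return p;
	return stats_alloc(dev);
}
```

Only the first stats_get() call on a device reaches stats_alloc(); later calls return the cached pointer without leaving the hot code.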
Yajun Deng Sept. 13, 2023, 2:20 a.m. UTC | #12
On 2023/9/13 02:05, Eric Dumazet wrote:
> On Tue, Sep 12, 2023 at 8:03 PM Eric Dumazet <edumazet@google.com> wrote:
>
>> Sure, this was what was suggested (perhaps not _very_ precisely, but
>> the general idea was pretty clear).
>> v2 seems ok, right ?
>>
>> It seems we are all on the same page.
>>
>> +static __cold struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
>> +{
>> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
>> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
>> +
>> +       if (likely(p))
>> +               return p;
>> +
>> +       return netdev_core_stats_alloc(dev);
>> +}
>> +
>> +#define DEV_CORE_STATS_INC(FIELD)                              \
>> +void dev_core_stats_##FIELD##_inc(struct net_device *dev)      \
>> +{                                                              \
>> +       struct net_device_core_stats __percpu *p;               \
>> +                                                               \
>> +       p = dev_core_stats(dev);                                \
>> +       if (p)                                                  \
>> +               this_cpu_inc(p->FIELD);                         \
>> +}                                                              \
>> +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)
> Oh well, I just read the patch, and it seems wrong indeed.
>
> netdev_core_stats_alloc() is the one that can be cold.

Okay, I will add __cold to netdev_core_stats_alloc() in v3.

Olek suggested defining a new dev_core_stats_inc() function.

I hope you can take a look at that suggestion in the other reply.
Alexander Lobakin Sept. 13, 2023, 9:58 a.m. UTC | #13
From: Yajun Deng <yajun.deng@linux.dev>
Date: Wed, 13 Sep 2023 10:08:08 +0800

> 
> On 2023/9/13 00:22, Alexander Lobakin wrote:
>> From: Yajun Deng <yajun.deng@linux.dev>
>> Date: Mon, 11 Sep 2023 16:20:16 +0800

[...]

>> EXPORT_SYMBOL_GPL(dev_core_stats_inc); // Why not GPL BTW?
> 
> This may be a better option.
> 
> Just because EXPORT_SYMBOL(netdev_core_stats_alloc) before,  but I think
> 
> EXPORT_SYMBOL_GPL is better.

Ah I see. BTW, if you still define the increment functions as
externals, there will be no reason to export netdev_core_stats_alloc()
or even make it non-static at all.

> 
>  
>> And then build inlines:
>>
>> #define DEV_CORE_STATS_INC(FIELD)				\
>> static inline void						\
>> dev_core_stats_##FIELD##_inc(struct net_device *dev)		\
>> {								\
>> 	dev_core_stats_inc(dev,					\
>> 		offsetof(struct net_device_core_stats, FIELD));	\
>> }
>>
>> DEV_CORE_STATS_INC(rx_dropped);
>> ...
>>
>> OR even just make them macros
>>
>> #define __DEV_CORE_STATS_INC(dev, field)			\
>> 	dev_core_stats_inc(dev,					\
>> 		offsetof(struct net_device_core_stats, field))
>>
>> #define dev_core_stats_rx_dropped_inc(dev)			\
>> 	__DEV_CORE_STATS_INC(dev, rx_dropped)
>> ...
> 
> I would like the former.  Keep it the same as before.

By "the former" you mean to build static inlines or externals? Seems
like the first one, but I got confused by your "the same as before" :D

> 
> 
>> Just don't copy that awful Thunderbird's line wrap and don't assume this
>> code builds and works and that is something finished/polished.
>>
>> You'll be able to trace functions and you'll be able to understand which
>> counter has been incremented by checking the second argument, i.e. the
>> field offset (IIRC tracing shows you arguments).
>> And that way you wouldn't geometrically increase the number of symbol
>> exports and deal with its consequences.
> I agree that.

Ok, after this one I guess you meant "I'd like to use your approach with
static inlines".

>>>  
>>>  /**
>>>   *	dev_get_stats	- get network device statistics
>> Thanks,
>> Olek

Thanks,
Olek
Yajun Deng Sept. 14, 2023, 2:44 a.m. UTC | #14
On 2023/9/13 17:58, Alexander Lobakin wrote:
> From: Yajun Deng <yajun.deng@linux.dev>
> Date: Wed, 13 Sep 2023 10:08:08 +0800
>
>> On 2023/9/13 00:22, Alexander Lobakin wrote:
>>> From: Yajun Deng <yajun.deng@linux.dev>
>>> Date: Mon, 11 Sep 2023 16:20:16 +0800
> [...]
>
>>> EXPORT_SYMBOL_GPL(dev_core_stats_inc); // Why not GPL BTW?
>> This may be a better option.
>>
>> Just because EXPORT_SYMBOL(netdev_core_stats_alloc) before,  but I think
>>
>> EXPORT_SYMBOL_GPL is better.
> Ah I see. BTW, if you will still define increment functions as
> externals, there will be no reason to export netdev_core_stats_alloc()
> or even make it non-static at all.
>
>>   
>>> And then build inlines:
>>>
>>> #define DEV_CORE_STATS_INC(FIELD)				\
>>> static inline void						\
>>> dev_core_stats_##FIELD##_inc(struct net_device *dev)		\
>>> {								\
>>> 	dev_core_stats_inc(dev,					\
>>> 		offsetof(struct net_device_core_stats, FIELD));	\
>>> }
>>>
>>> DEV_CORE_STATS_INC(rx_dropped);
>>> ...
>>>
>>> OR even just make them macros
>>>
>>> #define __DEV_CORE_STATS_INC(dev, field)			\
>>> 	dev_core_stats_inc(dev,					\
>>> 		offsetof(struct net_device_core_stats, field))
>>>
>>> #define dev_core_stats_rx_dropped_inc(dev)			\
>>> 	__DEV_CORE_STATS_INC(dev, rx_dropped)
>>> ...
>> I would like the former.  Keep it the same as before.
> By "the former" you mean to build static inlines or externals? Seems
> like the first one, but I got confused by your "the same as before" :D
>
>>
>>> Just don't copy that awful Thunderbird's line wrap and don't assume this
>>> code builds and works and that is something finished/polished.
>>>
>>> You'll be able to trace functions and you'll be able to understand which
>>> counter has been incremented by checking the second argument, i.e. the
>>> field offset (IIRC tracing shows you arguments).
>>> And that way you wouldn't geometrically increase the number of symbol
>>> exports and deal with its consequences.
>> I agree that.
> Ok, after this one I guess you meant "I'd like to use your approach with
> static inlines".

In the end, I'm giving up on this approach.

The new function dev_core_stats_inc() wouldn't be called by external
modules directly.

So EXPORT_SYMBOL_GPL(dev_core_stats_inc) could be removed by anyone.


>>>>   
>>>>   /**
>>>>    *	dev_get_stats	- get network device statistics
>>> Thanks,
>>> Olek
> Thanks,
> Olek
Alexander Lobakin Sept. 14, 2023, 2:10 p.m. UTC | #15
From: Yajun Deng <yajun.deng@linux.dev>
Date: Thu, 14 Sep 2023 10:44:14 +0800

> 
> On 2023/9/13 17:58, Alexander Lobakin wrote:
>> From: Yajun Deng <yajun.deng@linux.dev>
>> Date: Wed, 13 Sep 2023 10:08:08 +0800
>>
>>> On 2023/9/13 00:22, Alexander Lobakin wrote:
>>>> From: Yajun Deng <yajun.deng@linux.dev>
>>>> Date: Mon, 11 Sep 2023 16:20:16 +0800
>> [...]
>>
>>>> EXPORT_SYMBOL_GPL(dev_core_stats_inc); // Why not GPL BTW?
>>> This may be a better option.
>>>
>>> Just because EXPORT_SYMBOL(netdev_core_stats_alloc) before,  but I think
>>>
>>> EXPORT_SYMBOL_GPL is better.
>> Ah I see. BTW, if you will still define increment functions as
>> externals, there will be no reason to export netdev_core_stats_alloc()
>> or even make it non-static at all.
>>
>>>  
>>>> And then build inlines:
>>>>
>>>> #define DEV_CORE_STATS_INC(FIELD)                \
>>>> static inline void                        \
>>>> dev_core_stats_##FIELD##_inc(struct net_device *dev)        \
>>>> {                                \
>>>>     dev_core_stats_inc(dev,                    \
>>>>         offsetof(struct net_device_core_stats, FIELD));    \
>>>> }
>>>>
>>>> DEV_CORE_STATS_INC(rx_dropped);
>>>> ...
>>>>
>>>> OR even just make them macros
>>>>
>>>> #define __DEV_CORE_STATS_INC(dev, field)            \
>>>>     dev_core_stats_inc(dev,                    \
>>>>         offsetof(struct net_device_core_stats, field))
>>>>
>>>> #define dev_core_stats_rx_dropped_inc(dev)            \
>>>>     __DEV_CORE_STATS_INC(dev, rx_dropped)
>>>> ...
>>> I would like the former.  Keep it the same as before.
>> By "the former" you mean to build static inlines or externals? Seems
>> like the first one, but I got confused by your "the same as before" :D
>>
>>>
>>>> Just don't copy that awful Thunderbird's line wrap and don't assume
>>>> this
>>>> code builds and works and that is something finished/polished.
>>>>
>>>> You'll be able to trace functions and you'll be able to understand
>>>> which
>>>> counter has been incremented by checking the second argument, i.e. the
>>>> field offset (IIRC tracing shows you arguments).
>>>> And that way you wouldn't geometrically increase the number of symbol
>>>> exports and deal with its consequences.
>>> I agree that.
>> Ok, after this one I guess you meant "I'd like to use your approach with
>> static inlines".
> 
> Finally, I give up this approach.
> 
> The new function dev_core_stats_inc() didn't called by external modules
> directly.

If it's called via an inline or macro or whatever from module code, it
still needs to be exported.
Double-check that modpost doesn't complain on an allmodconfig build.

> 
> So EXPORT_SYMBOL_GPL(dev_core_stats_inc) can be removed by anyone.

That doesn't mean it won't be needed tomorrow.
And I don't feel like it's a good excuse to define 1 external function
per counter instead of 1 external + static inlines for the rest. It's
not only about the exports.
Esp. given that I wrote almost the whole code needed for it to work in
one of my previous replies.

If you don't want to do that, I could take it over xD
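
For reference, the "1 external + static inlines" layout can be sketched in userspace C as below. This is not the kernel code and has not been built against the tree: struct fake_dev, struct core_stats, stats_inc() and the STATS_INC() macro are hypothetical names, the counters are plain fields rather than per-CPU data, and the bounds check is only a userspace sanity aid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced counter block (plain fields, not per-CPU). */
struct core_stats {
	unsigned long rx_dropped;
	unsigned long tx_dropped;
	unsigned long rx_nohandler;
	unsigned long rx_otherhost_dropped;
};

struct fake_dev { struct core_stats stats; };

/*
 * The single out-of-line function. In the kernel this would be the only
 * symbol that needs an export; the offset argument identifies the counter.
 */
void stats_inc(struct fake_dev *dev, size_t offset)
{
	unsigned long *field;

	/* Sanity check: the offset must point at a counter inside the struct. */
	assert(offset + sizeof(*field) <= sizeof(dev->stats));
	field = (unsigned long *)((char *)&dev->stats + offset);
	(*field)++;
}

/* Per-counter wrappers are generated as static inlines, one per field. */
#define STATS_INC(FIELD)						\
static inline void stats_##FIELD##_inc(struct fake_dev *dev)		\
{									\
	stats_inc(dev, offsetof(struct core_stats, FIELD));		\
}

STATS_INC(rx_dropped)
STATS_INC(tx_dropped)
STATS_INC(rx_nohandler)
STATS_INC(rx_otherhost_dropped)
```

A tracer attached to the one out-of-line function sees the field offset as its second argument, which is how the incremented counter would be told apart.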

> 
> 
>>>>>     /**
>>>>>    *    dev_get_stats    - get network device statistics
>>>> Thanks,
>>>> Olek
>> Thanks,
>> Olek

Thanks,
Olek

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0896aaa91dd7..879b01c85ba4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3954,6 +3954,11 @@  int dev_forward_skb_nomtu(struct net_device *dev, struct sk_buff *skb);
 bool is_skb_forwardable(const struct net_device *dev,
 			const struct sk_buff *skb);
 
+void dev_core_stats_rx_dropped_inc(struct net_device *dev);
+void dev_core_stats_tx_dropped_inc(struct net_device *dev);
+void dev_core_stats_rx_nohandler_inc(struct net_device *dev);
+void dev_core_stats_rx_otherhost_dropped_inc(struct net_device *dev);
+
 static __always_inline bool __is_skb_forwardable(const struct net_device *dev,
 						 const struct sk_buff *skb,
 						 const bool check_mtu)
@@ -3980,33 +3985,6 @@  static __always_inline bool __is_skb_forwardable(const struct net_device *dev,
 	return false;
 }
 
-struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev);
-
-static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
-{
-	/* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
-	struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
-
-	if (likely(p))
-		return p;
-
-	return netdev_core_stats_alloc(dev);
-}
-
-#define DEV_CORE_STATS_INC(FIELD)						\
-static inline void dev_core_stats_##FIELD##_inc(struct net_device *dev)		\
-{										\
-	struct net_device_core_stats __percpu *p;				\
-										\
-	p = dev_core_stats(dev);						\
-	if (p)									\
-		this_cpu_inc(p->FIELD);						\
-}
-DEV_CORE_STATS_INC(rx_dropped)
-DEV_CORE_STATS_INC(tx_dropped)
-DEV_CORE_STATS_INC(rx_nohandler)
-DEV_CORE_STATS_INC(rx_otherhost_dropped)
-
 static __always_inline int ____dev_forward_skb(struct net_device *dev,
 					       struct sk_buff *skb,
 					       const bool check_mtu)
diff --git a/net/core/dev.c b/net/core/dev.c
index ccff2b6ef958..32ba730405b4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10475,7 +10475,7 @@  void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
 }
 EXPORT_SYMBOL(netdev_stats_to_stats64);
 
-struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
+static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
 {
 	struct net_device_core_stats __percpu *p;
 
@@ -10488,7 +10488,33 @@  struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
 	/* This READ_ONCE() pairs with the cmpxchg() above */
 	return READ_ONCE(dev->core_stats);
 }
-EXPORT_SYMBOL(netdev_core_stats_alloc);
+
+static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
+{
+	/* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
+	struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
+
+	if (likely(p))
+		return p;
+
+	return netdev_core_stats_alloc(dev);
+}
+
+#define DEV_CORE_STATS_INC(FIELD)				\
+void dev_core_stats_##FIELD##_inc(struct net_device *dev)	\
+{								\
+	struct net_device_core_stats __percpu *p;		\
+								\
+	p = dev_core_stats(dev);				\
+	if (p)							\
+		this_cpu_inc(p->FIELD);				\
+}								\
+EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)
+
+DEV_CORE_STATS_INC(rx_dropped);
+DEV_CORE_STATS_INC(tx_dropped);
+DEV_CORE_STATS_INC(rx_nohandler);
+DEV_CORE_STATS_INC(rx_otherhost_dropped);
 
 /**
  *	dev_get_stats	- get network device statistics