diff mbox series

[net-next,3/3] net: xilinx: axienet: Add statistics support

Message ID 20240610231022.2460953-4-sean.anderson@linux.dev (mailing list archive)
State New, archived
Headers show
Series net: xilinx: axienet: Add statistics support | expand

Commit Message

Sean Anderson June 10, 2024, 11:10 p.m. UTC
Add support for reading the statistics counters, if they are enabled.
The counters may be 64-bit, but we can't detect this as there's no
ability bit for it and the counters are read-only. Therefore, we assume
the counters are 32-bits. To ensure we don't miss an overflow, we need
to read all counters at regular intervals, configurable with
stats-block-usecs. This should be often enough to ensure the bytes
counters don't wrap at 2.5 Gbit/s.

Another complication is that the counters may be reset when the device
is reset (depending on configuration). To ensure the counters persist
across link up/down (including suspend/resume), we maintain our own
64-bit versions along with the last counter value we saw. Because we
might wait up to 100 ms for the reset to complete, we use a mutex to
protect writing hw_stats. We can't sleep in ndo_get_stats64, so we use a
u64_stats_sync to protect readers.

We can't use the byte counters for either get_stats64 or
get_eth_mac_stats. This is because the byte counters include everything
in the frame (destination address to FCS, inclusive). But
rtnl_link_stats64 wants bytes excluding the FCS, and
ethtool_eth_mac_stats wants to exclude the L2 overhead (addresses and
length/type).

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---

 drivers/net/ethernet/xilinx/xilinx_axienet.h  |  81 ++++++
 .../net/ethernet/xilinx/xilinx_axienet_main.c | 267 +++++++++++++++++-
 2 files changed, 345 insertions(+), 3 deletions(-)

Comments

Andrew Lunn June 11, 2024, 12:13 a.m. UTC | #1
On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
> Add support for reading the statistics counters, if they are enabled.
> The counters may be 64-bit, but we can't detect this as there's no
> ability bit for it and the counters are read-only. Therefore, we assume
> the counters are 32-bits.

> +static void axienet_stats_update(struct axienet_local *lp)
> +{
> +	enum temac_stat stat;
> +
> +	lockdep_assert_held(&lp->stats_lock);
> +
> +	u64_stats_update_begin(&lp->hw_stat_sync);
> +	for (stat = 0; stat < STAT_COUNT; stat++) {
> +		u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);

The * 8 here suggests the counters are spaced so that they could be 64
bit wide, even when only 32 bits are used. Does the documentation say
anything about the upper 32 bits when the counters are only 32 bits?
Are they guaranteed to read as zero? I'm just wondering if the code
should be forward looking and read all 64 bits? 

>  static int __axienet_device_reset(struct axienet_local *lp)
>  {
>  	u32 value;
>  	int ret;
>  
> +	/* Save statistics counters in case they will be reset */
> +	if (lp->features & XAE_FEATURE_STATS) {
> +		mutex_lock(&lp->stats_lock);
> +		axienet_stats_update(lp);
> +	}

It is a pretty unusual pattern to split a mutex lock/unlock like this
on an if statement. Maybe just unconditionally hold the mutex? This
does not appear to be anyway hot path, so the overhead should not
matter.

	Andrew
Andrew Lunn June 11, 2024, 12:26 a.m. UTC | #2
> +static u64 axienet_stat(struct axienet_local *lp, enum temac_stat stat)
> +{
> +	return u64_stats_read(&lp->hw_stats[stat]);
> +}
> @@ -1695,6 +1760,35 @@ axienet_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
>  		stats->tx_packets = u64_stats_read(&lp->tx_packets);
>  		stats->tx_bytes = u64_stats_read(&lp->tx_bytes);
>  	} while (u64_stats_fetch_retry(&lp->tx_stat_sync, start));
> +
> +	if (!(lp->features & XAE_FEATURE_STATS))
> +		return;
> +
> +	do {
> +		start = u64_stats_fetch_begin(&lp->hw_stat_sync);
> +		stats->rx_length_errors =
> +			axienet_stat(lp, STAT_RX_LENGTH_ERRORS);

I'm i reading this correctly. You are returning the counters from the
last refresh period. What is that? 2.5Gbps would wrapper around a 32
byte counter in 13 seconds. I hope these statistics are not 13 seconds
out of date?

Since axienet_stats_update() also uses the lp->hw_stat_sync, i don't
see why you cannot read the hardware counter value and update to the
latest value.

       Andrew
Andrew Lunn June 11, 2024, 12:29 a.m. UTC | #3
On Tue, Jun 11, 2024 at 02:13:40AM +0200, Andrew Lunn wrote:
> On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
> > Add support for reading the statistics counters, if they are enabled.
> > The counters may be 64-bit, but we can't detect this as there's no
> > ability bit for it and the counters are read-only. Therefore, we assume
> > the counters are 32-bits.
> 
> > +static void axienet_stats_update(struct axienet_local *lp)
> > +{
> > +	enum temac_stat stat;
> > +
> > +	lockdep_assert_held(&lp->stats_lock);
> > +
> > +	u64_stats_update_begin(&lp->hw_stat_sync);
> > +	for (stat = 0; stat < STAT_COUNT; stat++) {
> > +		u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
> 
> The * 8 here suggests the counters are spaced so that they could be 64
> bit wide, even when only 32 bits are used. Does the documentation say
> anything about the upper 32 bits when the counters are only 32 bits?
> Are they guaranteed to read as zero? I'm just wondering if the code
> should be forward looking and read all 64 bits? 

Actually, if you read the upper 32 bits and they are not 0, you know
you have 64 bit counters. You can then kill off your period task, it
is not needed because your software counters will wrap around the same
time as the hardware counters.

     Andrew
Sean Anderson June 11, 2024, 3:14 p.m. UTC | #4
On 6/10/24 20:13, Andrew Lunn wrote:
> On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
>> Add support for reading the statistics counters, if they are enabled.
>> The counters may be 64-bit, but we can't detect this as there's no
>> ability bit for it and the counters are read-only. Therefore, we assume
>> the counters are 32-bits.
> 
>> +static void axienet_stats_update(struct axienet_local *lp)
>> +{
>> +	enum temac_stat stat;
>> +
>> +	lockdep_assert_held(&lp->stats_lock);
>> +
>> +	u64_stats_update_begin(&lp->hw_stat_sync);
>> +	for (stat = 0; stat < STAT_COUNT; stat++) {
>> +		u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
> 
> The * 8 here suggests the counters are spaced so that they could be 64
> bit wide, even when only 32 bits are used.

Correct.

> Does the documentation say anything about the upper 32 bits when the
> counters are only 32 bits?  Are they guaranteed to read as zero? I'm
> just wondering if the code should be forward looking and read all 64
> bits? 

The registers are documented as being 32-bit, with the upper 32-bits
being registered upon reading the lower 32 bits. The documentation
doesn't say what the upper registers are when the counters are 32 bits.

>>  static int __axienet_device_reset(struct axienet_local *lp)
>>  {
>>  	u32 value;
>>  	int ret;
>>  
>> +	/* Save statistics counters in case they will be reset */
>> +	if (lp->features & XAE_FEATURE_STATS) {
>> +		mutex_lock(&lp->stats_lock);
>> +		axienet_stats_update(lp);
>> +	}
> 
> It is a pretty unusual pattern to split a mutex lock/unlock like this
> on an if statement. Maybe just unconditionally hold the mutex? This
> does not appear to be anyway hot path, so the overhead should not
> matter.

OK

--Sean
Sean Anderson June 11, 2024, 3:36 p.m. UTC | #5
On 6/10/24 20:26, Andrew Lunn wrote:
>> +static u64 axienet_stat(struct axienet_local *lp, enum temac_stat stat)
>> +{
>> +	return u64_stats_read(&lp->hw_stats[stat]);
>> +}
>> @@ -1695,6 +1760,35 @@ axienet_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
>>  		stats->tx_packets = u64_stats_read(&lp->tx_packets);
>>  		stats->tx_bytes = u64_stats_read(&lp->tx_bytes);
>>  	} while (u64_stats_fetch_retry(&lp->tx_stat_sync, start));
>> +
>> +	if (!(lp->features & XAE_FEATURE_STATS))
>> +		return;
>> +
>> +	do {
>> +		start = u64_stats_fetch_begin(&lp->hw_stat_sync);
>> +		stats->rx_length_errors =
>> +			axienet_stat(lp, STAT_RX_LENGTH_ERRORS);
> 
> I'm i reading this correctly. You are returning the counters from the
> last refresh period. What is that? 2.5Gbps would wrapper around a 32
> byte counter in 13 seconds. I hope these statistics are not 13 seconds
> out of date?

By default we use a 1 Hz refresh period. You can of course configure this
up to 13 seconds, but we refuse to raise it further since we risk missing
a wrap-around. It's configurable by userspace so they can determine how
out-of-date they like their stats (vs how often they want to wake up the
CPU).

> Since axienet_stats_update() also uses the lp->hw_stat_sync, i don't
> see why you cannot read the hardware counter value and update to the
> latest value.

We would need to synchronize against updates to hw_last_counter. Imagine
a scenario like

CPU 1					CPU 2
__axienet_device_reset()
	axienet_stats_update()
					axienet_stat()
						u64_stats_read()
						axienet_ior()
	/* device reset */
	hw_last_counter = 0
						stats->foo = ... - hw_last_counter[...]

and now we have a glitch in the counter values, since we effectively are
double-counting the current counter value. Alternatively, we could read
the counter after reset but before hw_last_counter was updated and get a
glitch due to underflow.

--Sean
Sean Anderson June 11, 2024, 4:43 p.m. UTC | #6
On 6/10/24 20:29, Andrew Lunn wrote:
> On Tue, Jun 11, 2024 at 02:13:40AM +0200, Andrew Lunn wrote:
>> On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
>> > Add support for reading the statistics counters, if they are enabled.
>> > The counters may be 64-bit, but we can't detect this as there's no
>> > ability bit for it and the counters are read-only. Therefore, we assume
>> > the counters are 32-bits.
>> 
>> > +static void axienet_stats_update(struct axienet_local *lp)
>> > +{
>> > +	enum temac_stat stat;
>> > +
>> > +	lockdep_assert_held(&lp->stats_lock);
>> > +
>> > +	u64_stats_update_begin(&lp->hw_stat_sync);
>> > +	for (stat = 0; stat < STAT_COUNT; stat++) {
>> > +		u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
>> 
>> The * 8 here suggests the counters are spaced so that they could be 64
>> bit wide, even when only 32 bits are used. Does the documentation say
>> anything about the upper 32 bits when the counters are only 32 bits?
>> Are they guaranteed to read as zero? I'm just wondering if the code
>> should be forward looking and read all 64 bits? 
> 
> Actually, if you read the upper 32 bits and they are not 0, you know
> you have 64 bit counters. You can then kill off your period task, it
> is not needed because your software counters will wrap around the same
> time as the hardware counters.

Yes, but then our stats remain stale forever, because we don't refresh
stats before reading them as detailed in my other response.

--Sean
Simon Horman June 14, 2024, 4:30 p.m. UTC | #7
On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
> Add support for reading the statistics counters, if they are enabled.
> The counters may be 64-bit, but we can't detect this as there's no
> ability bit for it and the counters are read-only. Therefore, we assume
> the counters are 32-bits. To ensure we don't miss an overflow, we need
> to read all counters at regular intervals, configurable with
> stats-block-usecs. This should be often enough to ensure the bytes
> counters don't wrap at 2.5 Gbit/s.
> 
> Another complication is that the counters may be reset when the device
> is reset (depending on configuration). To ensure the counters persist
> across link up/down (including suspend/resume), we maintain our own
> 64-bit versions along with the last counter value we saw. Because we
> might wait up to 100 ms for the reset to complete, we use a mutex to
> protect writing hw_stats. We can't sleep in ndo_get_stats64, so we use a
> u64_stats_sync to protect readers.
> 
> We can't use the byte counters for either get_stats64 or
> get_eth_mac_stats. This is because the byte counters include everything
> in the frame (destination address to FCS, inclusive). But
> rtnl_link_stats64 wants bytes excluding the FCS, and
> ethtool_eth_mac_stats wants to exclude the L2 overhead (addresses and
> length/type).
> 
> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
> ---
> 
>  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  81 ++++++
>  .../net/ethernet/xilinx/xilinx_axienet_main.c | 267 +++++++++++++++++-
>  2 files changed, 345 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h

...

> @@ -434,6 +502,11 @@ struct skbuf_dma_descriptor {
>   * @tx_packets: TX packet count for statistics
>   * @tx_bytes:	TX byte count for statistics
>   * @tx_stat_sync: Synchronization object for TX stats
> + * @hw_last_counter: Last-seen value of each statistic
> + * @hw_stats: Interface statistics periodically updated from hardware counters
> + * @hw_stats_sync: Synchronization object for @hw_stats

nit: s/hw_stats_sync/hw_stat_sync/

     Flagged by kernel-doc -none

> + * @stats_lock: Lock for writing @hw_stats and @hw_last_counter
> + * @stats_work: Work for reading the hardware statistics counters
>   * @dma_err_task: Work structure to process Axi DMA errors
>   * @tx_irq:	Axidma TX IRQ number
>   * @rx_irq:	Axidma RX IRQ number
> @@ -452,6 +525,7 @@ struct skbuf_dma_descriptor {
>   * @coalesce_usec_rx:	IRQ coalesce delay for RX
>   * @coalesce_count_tx:	Store the irq coalesce on TX side.
>   * @coalesce_usec_tx:	IRQ coalesce delay for TX
> + * @coalesce_usec_stats: Delay between hardware statistics refreshes
>   * @use_dmaengine: flag to check dmaengine framework usage.
>   * @tx_chan:	TX DMA channel.
>   * @rx_chan:	RX DMA channel.
> @@ -505,6 +579,12 @@ struct axienet_local {
>  	u64_stats_t tx_bytes;
>  	struct u64_stats_sync tx_stat_sync;
>  
> +	u32 hw_last_counter[STAT_COUNT];
> +	u64_stats_t hw_stats[STAT_COUNT];
> +	struct u64_stats_sync hw_stat_sync;
> +	struct mutex stats_lock;
> +	struct delayed_work stats_work;
> +
>  	struct work_struct dma_err_task;
>  
>  	int tx_irq;

...
Sean Anderson June 14, 2024, 8:54 p.m. UTC | #8
On 6/14/24 12:30, Simon Horman wrote:
> On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
>> Add support for reading the statistics counters, if they are enabled.
>> The counters may be 64-bit, but we can't detect this as there's no
>> ability bit for it and the counters are read-only. Therefore, we assume
>> the counters are 32-bits. To ensure we don't miss an overflow, we need
>> to read all counters at regular intervals, configurable with
>> stats-block-usecs. This should be often enough to ensure the bytes
>> counters don't wrap at 2.5 Gbit/s.
>> 
>> Another complication is that the counters may be reset when the device
>> is reset (depending on configuration). To ensure the counters persist
>> across link up/down (including suspend/resume), we maintain our own
>> 64-bit versions along with the last counter value we saw. Because we
>> might wait up to 100 ms for the reset to complete, we use a mutex to
>> protect writing hw_stats. We can't sleep in ndo_get_stats64, so we use a
>> u64_stats_sync to protect readers.
>> 
>> We can't use the byte counters for either get_stats64 or
>> get_eth_mac_stats. This is because the byte counters include everything
>> in the frame (destination address to FCS, inclusive). But
>> rtnl_link_stats64 wants bytes excluding the FCS, and
>> ethtool_eth_mac_stats wants to exclude the L2 overhead (addresses and
>> length/type).
>> 
>> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
>> ---
>> 
>>  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  81 ++++++
>>  .../net/ethernet/xilinx/xilinx_axienet_main.c | 267 +++++++++++++++++-
>>  2 files changed, 345 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> 
> ...
> 
>> @@ -434,6 +502,11 @@ struct skbuf_dma_descriptor {
>>   * @tx_packets: TX packet count for statistics
>>   * @tx_bytes:	TX byte count for statistics
>>   * @tx_stat_sync: Synchronization object for TX stats
>> + * @hw_last_counter: Last-seen value of each statistic
>> + * @hw_stats: Interface statistics periodically updated from hardware counters
>> + * @hw_stats_sync: Synchronization object for @hw_stats
> 
> nit: s/hw_stats_sync/hw_stat_sync/
> 
>      Flagged by kernel-doc -none

I think I will rename the member instead.

--Sean

>> + * @stats_lock: Lock for writing @hw_stats and @hw_last_counter
>> + * @stats_work: Work for reading the hardware statistics counters
>>   * @dma_err_task: Work structure to process Axi DMA errors
>>   * @tx_irq:	Axidma TX IRQ number
>>   * @rx_irq:	Axidma RX IRQ number
>> @@ -452,6 +525,7 @@ struct skbuf_dma_descriptor {
>>   * @coalesce_usec_rx:	IRQ coalesce delay for RX
>>   * @coalesce_count_tx:	Store the irq coalesce on TX side.
>>   * @coalesce_usec_tx:	IRQ coalesce delay for TX
>> + * @coalesce_usec_stats: Delay between hardware statistics refreshes
>>   * @use_dmaengine: flag to check dmaengine framework usage.
>>   * @tx_chan:	TX DMA channel.
>>   * @rx_chan:	RX DMA channel.
>> @@ -505,6 +579,12 @@ struct axienet_local {
>>  	u64_stats_t tx_bytes;
>>  	struct u64_stats_sync tx_stat_sync;
>>  
>> +	u32 hw_last_counter[STAT_COUNT];
>> +	u64_stats_t hw_stats[STAT_COUNT];
>> +	struct u64_stats_sync hw_stat_sync;
>> +	struct mutex stats_lock;
>> +	struct delayed_work stats_work;
>> +
>>  	struct work_struct dma_err_task;
>>  
>>  	int tx_irq;
> 
> ...
Sean Anderson June 18, 2024, 5:03 p.m. UTC | #9
Hi Andrew,

On 6/11/24 11:36, Sean Anderson wrote:
> On 6/10/24 20:26, Andrew Lunn wrote:
>>> +static u64 axienet_stat(struct axienet_local *lp, enum temac_stat stat)
>>> +{
>>> +	return u64_stats_read(&lp->hw_stats[stat]);
>>> +}
>>> @@ -1695,6 +1760,35 @@ axienet_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
>>>  		stats->tx_packets = u64_stats_read(&lp->tx_packets);
>>>  		stats->tx_bytes = u64_stats_read(&lp->tx_bytes);
>>>  	} while (u64_stats_fetch_retry(&lp->tx_stat_sync, start));
>>> +
>>> +	if (!(lp->features & XAE_FEATURE_STATS))
>>> +		return;
>>> +
>>> +	do {
>>> +		start = u64_stats_fetch_begin(&lp->hw_stat_sync);
>>> +		stats->rx_length_errors =
>>> +			axienet_stat(lp, STAT_RX_LENGTH_ERRORS);
>> 
>> I'm i reading this correctly. You are returning the counters from the
>> last refresh period. What is that? 2.5Gbps would wrapper around a 32
>> byte counter in 13 seconds. I hope these statistics are not 13 seconds
>> out of date?
> 
> By default we use a 1 Hz refresh period. You can of course configure this
> up to 13 seconds, but we refuse to raise it further since we risk missing
> a wrap-around. It's configurable by userspace so they can determine how
> out-of-date they like their stats (vs how often they want to wake up the
> CPU).
> 
>> Since axienet_stats_update() also uses the lp->hw_stat_sync, i don't
>> see why you cannot read the hardware counter value and update to the
>> latest value.
> 
> We would need to synchronize against updates to hw_last_counter. Imagine
> a scenario like
> 
> CPU 1					CPU 2
> __axienet_device_reset()
> 	axienet_stats_update()
> 					axienet_stat()
> 						u64_stats_read()
> 						axienet_ior()
> 	/* device reset */
> 	hw_last_counter = 0
> 						stats->foo = ... - hw_last_counter[...]
> 
> and now we have a glitch in the counter values, since we effectively are
> double-counting the current counter value. Alternatively, we could read
> the counter after reset but before hw_last_counter was updated and get a
> glitch due to underflow.

Does this make sense to you? If it does, I'll send v2 with just the mutex
change and the variable rename pointed out by Simon.

--Sean
Andrew Lunn June 18, 2024, 6:19 p.m. UTC | #10
On Tue, Jun 18, 2024 at 01:03:47PM -0400, Sean Anderson wrote:
> Hi Andrew,
> 
> On 6/11/24 11:36, Sean Anderson wrote:
> > On 6/10/24 20:26, Andrew Lunn wrote:
> >>> +static u64 axienet_stat(struct axienet_local *lp, enum temac_stat stat)
> >>> +{
> >>> +	return u64_stats_read(&lp->hw_stats[stat]);
> >>> +}
> >>> @@ -1695,6 +1760,35 @@ axienet_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
> >>>  		stats->tx_packets = u64_stats_read(&lp->tx_packets);
> >>>  		stats->tx_bytes = u64_stats_read(&lp->tx_bytes);
> >>>  	} while (u64_stats_fetch_retry(&lp->tx_stat_sync, start));
> >>> +
> >>> +	if (!(lp->features & XAE_FEATURE_STATS))
> >>> +		return;
> >>> +
> >>> +	do {
> >>> +		start = u64_stats_fetch_begin(&lp->hw_stat_sync);
> >>> +		stats->rx_length_errors =
> >>> +			axienet_stat(lp, STAT_RX_LENGTH_ERRORS);
> >> 
> >> I'm i reading this correctly. You are returning the counters from the
> >> last refresh period. What is that? 2.5Gbps would wrapper around a 32
> >> byte counter in 13 seconds. I hope these statistics are not 13 seconds
> >> out of date?
> > 
> > By default we use a 1 Hz refresh period. You can of course configure this
> > up to 13 seconds, but we refuse to raise it further since we risk missing
> > a wrap-around. It's configurable by userspace so they can determine how
> > out-of-date they like their stats (vs how often they want to wake up the
> > CPU).
> > 
> >> Since axienet_stats_update() also uses the lp->hw_stat_sync, i don't
> >> see why you cannot read the hardware counter value and update to the
> >> latest value.
> > 
> > We would need to synchronize against updates to hw_last_counter. Imagine
> > a scenario like
> > 
> > CPU 1					CPU 2
> > __axienet_device_reset()
> > 	axienet_stats_update()
> > 					axienet_stat()
> > 						u64_stats_read()
> > 						axienet_ior()
> > 	/* device reset */
> > 	hw_last_counter = 0
> > 						stats->foo = ... - hw_last_counter[...]
> > 
> > and now we have a glitch in the counter values, since we effectively are
> > double-counting the current counter value. Alternatively, we could read
> > the counter after reset but before hw_last_counter was updated and get a
> > glitch due to underflow.
> 
> Does this make sense to you? If it does, I'll send v2 with just the mutex
> change and the variable rename pointed out by Simon.

What you have is O.K. I just think you can do better.

As you point out, there is a potential race. There are a few
synchronisation mechanisms for that.

Often you arrange to have exclusive access to a data structure, so you
know it cannot change while you use it. But as you pointed out, you
are not in a context which can block on a mutex.

Another mechanism is to know if the data structure has changed while
you where using it. If it has, throw away what you have, and start
again. That is what lp->hw_stat_sync etc is all about. You loop while
its value changes, indicating something made changes. You should be
able to use this when returning statistics to user space, to return
the real current statistics, not old cached values.

    Andrew
diff mbox series

Patch

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
index fa5500decc96..c121e9b1c41d 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
@@ -156,6 +156,7 @@ 
 #define XAE_TPID0_OFFSET	0x00000028 /* VLAN TPID0 register */
 #define XAE_TPID1_OFFSET	0x0000002C /* VLAN TPID1 register */
 #define XAE_PPST_OFFSET		0x00000030 /* PCS PMA Soft Temac Status Reg */
+#define XAE_STATS_OFFSET	0x00000200 /* Statistics counters */
 #define XAE_RCW0_OFFSET		0x00000400 /* Rx Configuration Word 0 */
 #define XAE_RCW1_OFFSET		0x00000404 /* Rx Configuration Word 1 */
 #define XAE_TC_OFFSET		0x00000408 /* Tx Configuration */
@@ -163,6 +164,7 @@ 
 #define XAE_EMMC_OFFSET		0x00000410 /* EMAC mode configuration */
 #define XAE_PHYC_OFFSET		0x00000414 /* RGMII/SGMII configuration */
 #define XAE_ID_OFFSET		0x000004F8 /* Identification register */
+#define XAE_ABILITY_OFFSET	0x000004FC /* Ability Register offset */
 #define XAE_MDIO_MC_OFFSET	0x00000500 /* MII Management Config */
 #define XAE_MDIO_MCR_OFFSET	0x00000504 /* MII Management Control */
 #define XAE_MDIO_MWD_OFFSET	0x00000508 /* MII Management Write Data */
@@ -283,6 +285,16 @@ 
 #define XAE_PHYC_SGLINKSPD_100		0x40000000 /* SGMII link 100 Mbit */
 #define XAE_PHYC_SGLINKSPD_1000		0x80000000 /* SGMII link 1000 Mbit */
 
+/* Bit masks for Axi Ethernet ability register */
+#define XAE_ABILITY_PFC			BIT(16)
+#define XAE_ABILITY_FRAME_FILTER	BIT(10)
+#define XAE_ABILITY_HALF_DUPLEX		BIT(9)
+#define XAE_ABILITY_STATS		BIT(8)
+#define XAE_ABILITY_2_5G		BIT(3)
+#define XAE_ABILITY_1G			BIT(2)
+#define XAE_ABILITY_100M		BIT(1)
+#define XAE_ABILITY_10M			BIT(0)
+
 /* Bit masks for Axi Ethernet MDIO interface MC register */
 #define XAE_MDIO_MC_MDIOEN_MASK		0x00000040 /* MII management enable */
 #define XAE_MDIO_MC_CLOCK_DIVIDE_MAX	0x3F	   /* Maximum MDIO divisor */
@@ -331,6 +343,7 @@ 
 #define XAE_FEATURE_FULL_RX_CSUM	(1 << 2)
 #define XAE_FEATURE_FULL_TX_CSUM	(1 << 3)
 #define XAE_FEATURE_DMA_64BIT		(1 << 4)
+#define XAE_FEATURE_STATS		(1 << 5)
 
 #define XAE_NO_CSUM_OFFLOAD		0
 
@@ -344,6 +357,61 @@ 
 #define XLNX_MII_STD_SELECT_REG		0x11
 #define XLNX_MII_STD_SELECT_SGMII	BIT(0)
 
+/* enum temac_stat - TEMAC statistics counters
+ *
+ * Index of statistics counters within the TEMAC. This must match the
+ * order/offset of hardware registers exactly.
+ */
+enum temac_stat {
+	STAT_RX_BYTES = 0,
+	STAT_TX_BYTES,
+	STAT_UNDERSIZE_FRAMES,
+	STAT_FRAGMENT_FRAMES,
+	STAT_RX_64_BYTE_FRAMES,
+	STAT_RX_65_127_BYTE_FRAMES,
+	STAT_RX_128_255_BYTE_FRAMES,
+	STAT_RX_256_511_BYTE_FRAMES,
+	STAT_RX_512_1023_BYTE_FRAMES,
+	STAT_RX_1024_MAX_BYTE_FRAMES,
+	STAT_RX_OVERSIZE_FRAMES,
+	STAT_TX_64_BYTE_FRAMES,
+	STAT_TX_65_127_BYTE_FRAMES,
+	STAT_TX_128_255_BYTE_FRAMES,
+	STAT_TX_256_511_BYTE_FRAMES,
+	STAT_TX_512_1023_BYTE_FRAMES,
+	STAT_TX_1024_MAX_BYTE_FRAMES,
+	STAT_TX_OVERSIZE_FRAMES,
+	STAT_RX_GOOD_FRAMES,
+	STAT_RX_FCS_ERRORS,
+	STAT_RX_BROADCAST_FRAMES,
+	STAT_RX_MULTICAST_FRAMES,
+	STAT_RX_CONTROL_FRAMES,
+	STAT_RX_LENGTH_ERRORS,
+	STAT_RX_VLAN_FRAMES,
+	STAT_RX_PAUSE_FRAMES,
+	STAT_RX_CONTROL_OPCODE_ERRORS,
+	STAT_TX_GOOD_FRAMES,
+	STAT_TX_BROADCAST_FRAMES,
+	STAT_TX_MULTICAST_FRAMES,
+	STAT_TX_UNDERRUN_ERRORS,
+	STAT_TX_CONTROL_FRAMES,
+	STAT_TX_VLAN_FRAMES,
+	STAT_TX_PAUSE_FRAMES,
+	STAT_TX_SINGLE_COLLISION_FRAMES,
+	STAT_TX_MULTIPLE_COLLISION_FRAMES,
+	STAT_TX_DEFERRED_FRAMES,
+	STAT_TX_LATE_COLLISIONS,
+	STAT_TX_EXCESS_COLLISIONS,
+	STAT_TX_EXCESS_DEFERRAL,
+	STAT_RX_ALIGNMENT_ERRORS,
+	STAT_TX_PFC_FRAMES,
+	STAT_RX_PFC_FRAMES,
+	STAT_USER_DEFINED0,
+	STAT_USER_DEFINED1,
+	STAT_USER_DEFINED2,
+	STAT_COUNT,
+};
+
 /**
  * struct axidma_bd - Axi Dma buffer descriptor layout
  * @next:         MM2S/S2MM Next Descriptor Pointer
@@ -434,6 +502,11 @@  struct skbuf_dma_descriptor {
  * @tx_packets: TX packet count for statistics
  * @tx_bytes:	TX byte count for statistics
  * @tx_stat_sync: Synchronization object for TX stats
+ * @hw_last_counter: Last-seen value of each statistic
+ * @hw_stats: Interface statistics periodically updated from hardware counters
+ * @hw_stats_sync: Synchronization object for @hw_stats
+ * @stats_lock: Lock for writing @hw_stats and @hw_last_counter
+ * @stats_work: Work for reading the hardware statistics counters
  * @dma_err_task: Work structure to process Axi DMA errors
  * @tx_irq:	Axidma TX IRQ number
  * @rx_irq:	Axidma RX IRQ number
@@ -452,6 +525,7 @@  struct skbuf_dma_descriptor {
  * @coalesce_usec_rx:	IRQ coalesce delay for RX
  * @coalesce_count_tx:	Store the irq coalesce on TX side.
  * @coalesce_usec_tx:	IRQ coalesce delay for TX
+ * @coalesce_usec_stats: Delay between hardware statistics refreshes
  * @use_dmaengine: flag to check dmaengine framework usage.
  * @tx_chan:	TX DMA channel.
  * @rx_chan:	RX DMA channel.
@@ -505,6 +579,12 @@  struct axienet_local {
 	u64_stats_t tx_bytes;
 	struct u64_stats_sync tx_stat_sync;
 
+	u32 hw_last_counter[STAT_COUNT];
+	u64_stats_t hw_stats[STAT_COUNT];
+	struct u64_stats_sync hw_stat_sync;
+	struct mutex stats_lock;
+	struct delayed_work stats_work;
+
 	struct work_struct dma_err_task;
 
 	int tx_irq;
@@ -525,6 +605,7 @@  struct axienet_local {
 	u32 coalesce_usec_rx;
 	u32 coalesce_count_tx;
 	u32 coalesce_usec_tx;
+	u32 coalesce_usec_stats;
 	u8  use_dmaengine;
 	struct dma_chan *tx_chan;
 	struct dma_chan *rx_chan;
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index cf8908794409..0ec1dc6dde3f 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -518,11 +518,53 @@  static void axienet_setoptions(struct net_device *ndev, u32 options)
 	lp->options |= options;
 }
 
+static u64 axienet_stat(struct axienet_local *lp, enum temac_stat stat)
+{
+	return u64_stats_read(&lp->hw_stats[stat]);
+}
+
+static void axienet_stats_update(struct axienet_local *lp)
+{
+	enum temac_stat stat;
+
+	lockdep_assert_held(&lp->stats_lock);
+
+	u64_stats_update_begin(&lp->hw_stat_sync);
+	for (stat = 0; stat < STAT_COUNT; stat++) {
+		u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
+
+		u64_stats_add(&lp->hw_stats[stat],
+			      counter - lp->hw_last_counter[stat]);
+		lp->hw_last_counter[stat] = counter;
+	}
+	u64_stats_update_end(&lp->hw_stat_sync);
+}
+
+static void axienet_refresh_stats(struct work_struct *work)
+{
+	struct axienet_local *lp = container_of(work, struct axienet_local,
+						stats_work.work);
+
+	mutex_lock(&lp->stats_lock);
+	axienet_stats_update(lp);
+	mutex_unlock(&lp->stats_lock);
+
+	if (lp->coalesce_usec_stats)
+		schedule_delayed_work(&lp->stats_work,
+				      usecs_to_jiffies(lp->coalesce_usec_stats));
+}
+
 static int __axienet_device_reset(struct axienet_local *lp)
 {
 	u32 value;
 	int ret;
 
+	/* Save statistics counters in case they will be reset */
+	if (lp->features & XAE_FEATURE_STATS) {
+		mutex_lock(&lp->stats_lock);
+		axienet_stats_update(lp);
+	}
+
 	/* Reset Axi DMA. This would reset Axi Ethernet core as well. The reset
 	 * process of Axi DMA takes a while to complete as all pending
 	 * commands/transfers will be flushed or completed during this
@@ -537,7 +579,7 @@  static int __axienet_device_reset(struct axienet_local *lp)
 				XAXIDMA_TX_CR_OFFSET);
 	if (ret) {
 		dev_err(lp->dev, "%s: DMA reset timeout!\n", __func__);
-		return ret;
+		goto err;
 	}
 
 	/* Wait for PhyRstCmplt bit to be set, indicating the PHY reset has finished */
@@ -547,10 +589,25 @@  static int __axienet_device_reset(struct axienet_local *lp)
 				XAE_IS_OFFSET);
 	if (ret) {
 		dev_err(lp->dev, "%s: timeout waiting for PhyRstCmplt\n", __func__);
-		return ret;
+		goto err;
+	}
+
+	/* Update statistics counters with new values */
+	if (lp->features & XAE_FEATURE_STATS) {
+		enum temac_stat stat;
+
+		for (stat = 0; stat < STAT_COUNT; stat++)
+			lp->hw_last_counter[stat] =
+				axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
+		mutex_unlock(&lp->stats_lock);
 	}
 
 	return 0;
+
+err:
+	if (lp->features & XAE_FEATURE_STATS)
+		mutex_unlock(&lp->stats_lock);
+	return ret;
 }
 
 /**
@@ -1532,6 +1589,11 @@  static int axienet_open(struct net_device *ndev)
 
 	phylink_start(lp->phylink);
 
+	/* Start the statistics refresh work */
+	if (lp->coalesce_usec_stats)
+		schedule_delayed_work(&lp->stats_work,
+				      usecs_to_jiffies(lp->coalesce_usec_stats));
+
 	if (lp->use_dmaengine) {
 		/* Enable interrupts for Axi Ethernet core (if defined) */
 		if (lp->eth_irq > 0) {
@@ -1556,6 +1618,7 @@  static int axienet_open(struct net_device *ndev)
 	if (lp->eth_irq > 0)
 		free_irq(lp->eth_irq, ndev);
 err_phy:
+	cancel_delayed_work_sync(&lp->stats_work);
 	phylink_stop(lp->phylink);
 	phylink_disconnect_phy(lp->phylink);
 	return ret;
@@ -1583,6 +1646,8 @@  static int axienet_stop(struct net_device *ndev)
 		napi_disable(&lp->napi_rx);
 	}
 
+	cancel_delayed_work_sync(&lp->stats_work);
+
 	phylink_stop(lp->phylink);
 	phylink_disconnect_phy(lp->phylink);
 
@@ -1695,6 +1760,35 @@  axienet_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 		stats->tx_packets = u64_stats_read(&lp->tx_packets);
 		stats->tx_bytes = u64_stats_read(&lp->tx_bytes);
 	} while (u64_stats_fetch_retry(&lp->tx_stat_sync, start));
+
+	if (!(lp->features & XAE_FEATURE_STATS))
+		return;
+
+	do {
+		start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+		stats->rx_length_errors =
+			axienet_stat(lp, STAT_RX_LENGTH_ERRORS);
+		stats->rx_crc_errors = axienet_stat(lp, STAT_RX_FCS_ERRORS);
+		stats->rx_frame_errors =
+			axienet_stat(lp, STAT_RX_ALIGNMENT_ERRORS);
+		stats->rx_errors = axienet_stat(lp, STAT_UNDERSIZE_FRAMES) +
+				   axienet_stat(lp, STAT_FRAGMENT_FRAMES) +
+				   stats->rx_length_errors +
+				   stats->rx_crc_errors +
+				   stats->rx_frame_errors;
+		stats->multicast = axienet_stat(lp, STAT_RX_MULTICAST_FRAMES);
+
+		stats->tx_aborted_errors =
+			axienet_stat(lp, STAT_TX_EXCESS_COLLISIONS);
+		stats->tx_fifo_errors =
+			axienet_stat(lp, STAT_TX_UNDERRUN_ERRORS);
+		stats->tx_window_errors =
+			axienet_stat(lp, STAT_TX_LATE_COLLISIONS);
+		stats->tx_errors = axienet_stat(lp, STAT_TX_EXCESS_DEFERRAL) +
+				   stats->tx_aborted_errors +
+				   stats->tx_fifo_errors +
+				   stats->tx_window_errors;
+	} while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
 }
 
 static const struct net_device_ops axienet_netdev_ops = {
@@ -1920,6 +2014,7 @@  axienet_ethtools_get_coalesce(struct net_device *ndev,
 	ecoalesce->rx_coalesce_usecs = lp->coalesce_usec_rx;
 	ecoalesce->tx_max_coalesced_frames = lp->coalesce_count_tx;
 	ecoalesce->tx_coalesce_usecs = lp->coalesce_usec_tx;
+	ecoalesce->stats_block_coalesce_usecs = lp->coalesce_usec_stats;
 	return 0;
 }
 
@@ -1958,6 +2053,9 @@  axienet_ethtools_set_coalesce(struct net_device *ndev,
 		lp->coalesce_count_tx = ecoalesce->tx_max_coalesced_frames;
 	if (ecoalesce->tx_coalesce_usecs)
 		lp->coalesce_usec_tx = ecoalesce->tx_coalesce_usecs;
+	/* Just less than 2^32 bytes at 2.5 GBit/s */
+	lp->coalesce_usec_stats = min(ecoalesce->stats_block_coalesce_usecs,
+				      13 * USEC_PER_SEC);
 
 	return 0;
 }
@@ -1987,9 +2085,160 @@  static int axienet_ethtools_nway_reset(struct net_device *dev)
 	return phylink_ethtool_nway_reset(lp->phylink);
 }
 
+static void
+axienet_ethtools_get_pause_stats(struct net_device *dev,
+				 struct ethtool_pause_stats *pause_stats)
+{
+	struct axienet_local *lp = netdev_priv(dev);
+	unsigned int start;
+
+	if (!(lp->features & XAE_FEATURE_STATS))
+		return;
+
+	do {
+		start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+		pause_stats->tx_pause_frames =
+			axienet_stat(lp, STAT_TX_PAUSE_FRAMES);
+		pause_stats->rx_pause_frames =
+			axienet_stat(lp, STAT_RX_PAUSE_FRAMES);
+	} while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
+}
+
+static void
+axienet_ethtool_get_eth_mac_stats(struct net_device *dev,
+				  struct ethtool_eth_mac_stats *mac_stats)
+{
+	struct axienet_local *lp = netdev_priv(dev);
+	unsigned int start;
+
+	if (!(lp->features & XAE_FEATURE_STATS))
+		return;
+
+	do {
+		start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+		mac_stats->FramesTransmittedOK =
+			axienet_stat(lp, STAT_TX_GOOD_FRAMES);
+		mac_stats->SingleCollisionFrames =
+			axienet_stat(lp, STAT_TX_SINGLE_COLLISION_FRAMES);
+		mac_stats->MultipleCollisionFrames =
+			axienet_stat(lp, STAT_TX_MULTIPLE_COLLISION_FRAMES);
+		mac_stats->FramesReceivedOK =
+			axienet_stat(lp, STAT_RX_GOOD_FRAMES);
+		mac_stats->FrameCheckSequenceErrors =
+			axienet_stat(lp, STAT_RX_FCS_ERRORS);
+		mac_stats->AlignmentErrors =
+			axienet_stat(lp, STAT_RX_ALIGNMENT_ERRORS);
+		mac_stats->FramesWithDeferredXmissions =
+			axienet_stat(lp, STAT_TX_DEFERRED_FRAMES);
+		mac_stats->LateCollisions =
+			axienet_stat(lp, STAT_TX_LATE_COLLISIONS);
+		mac_stats->FramesAbortedDueToXSColls =
+			axienet_stat(lp, STAT_TX_EXCESS_COLLISIONS);
+		mac_stats->MulticastFramesXmittedOK =
+			axienet_stat(lp, STAT_TX_MULTICAST_FRAMES);
+		mac_stats->BroadcastFramesXmittedOK =
+			axienet_stat(lp, STAT_TX_BROADCAST_FRAMES);
+		mac_stats->FramesWithExcessiveDeferral =
+			axienet_stat(lp, STAT_TX_EXCESS_DEFERRAL);
+		mac_stats->MulticastFramesReceivedOK =
+			axienet_stat(lp, STAT_RX_MULTICAST_FRAMES);
+		mac_stats->BroadcastFramesReceivedOK =
+			axienet_stat(lp, STAT_RX_BROADCAST_FRAMES);
+		mac_stats->InRangeLengthErrors =
+			axienet_stat(lp, STAT_RX_LENGTH_ERRORS);
+	} while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
+}
+
+static void
+axienet_ethtool_get_eth_ctrl_stats(struct net_device *dev,
+				   struct ethtool_eth_ctrl_stats *ctrl_stats)
+{
+	struct axienet_local *lp = netdev_priv(dev);
+	unsigned int start;
+
+	if (!(lp->features & XAE_FEATURE_STATS))
+		return;
+
+	do {
+		start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+		ctrl_stats->MACControlFramesTransmitted =
+			axienet_stat(lp, STAT_TX_CONTROL_FRAMES);
+		ctrl_stats->MACControlFramesReceived =
+			axienet_stat(lp, STAT_RX_CONTROL_FRAMES);
+		ctrl_stats->UnsupportedOpcodesReceived =
+			axienet_stat(lp, STAT_RX_CONTROL_OPCODE_ERRORS);
+	} while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
+}
+
+static const struct ethtool_rmon_hist_range axienet_rmon_ranges[] = {
+	{   64,    64 },
+	{   65,   127 },
+	{  128,   255 },
+	{  256,   511 },
+	{  512,  1023 },
+	{ 1024,  1518 },
+	{ 1519, 16384 },
+	{ },
+};
+
+static void
+axienet_ethtool_get_rmon_stats(struct net_device *dev,
+			       struct ethtool_rmon_stats *rmon_stats,
+			       const struct ethtool_rmon_hist_range **ranges)
+{
+	struct axienet_local *lp = netdev_priv(dev);
+	unsigned int start;
+
+	if (!(lp->features & XAE_FEATURE_STATS))
+		return;
+
+	do {
+		start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+		rmon_stats->undersize_pkts =
+			axienet_stat(lp, STAT_UNDERSIZE_FRAMES);
+		rmon_stats->oversize_pkts =
+			axienet_stat(lp, STAT_RX_OVERSIZE_FRAMES);
+		rmon_stats->fragments =
+			axienet_stat(lp, STAT_FRAGMENT_FRAMES);
+
+		rmon_stats->hist[0] =
+			axienet_stat(lp, STAT_RX_64_BYTE_FRAMES);
+		rmon_stats->hist[1] =
+			axienet_stat(lp, STAT_RX_65_127_BYTE_FRAMES);
+		rmon_stats->hist[2] =
+			axienet_stat(lp, STAT_RX_128_255_BYTE_FRAMES);
+		rmon_stats->hist[3] =
+			axienet_stat(lp, STAT_RX_256_511_BYTE_FRAMES);
+		rmon_stats->hist[4] =
+			axienet_stat(lp, STAT_RX_512_1023_BYTE_FRAMES);
+		rmon_stats->hist[5] =
+			axienet_stat(lp, STAT_RX_1024_MAX_BYTE_FRAMES);
+		rmon_stats->hist[6] =
+			rmon_stats->oversize_pkts;
+
+		rmon_stats->hist_tx[0] =
+			axienet_stat(lp, STAT_TX_64_BYTE_FRAMES);
+		rmon_stats->hist_tx[1] =
+			axienet_stat(lp, STAT_TX_65_127_BYTE_FRAMES);
+		rmon_stats->hist_tx[2] =
+			axienet_stat(lp, STAT_TX_128_255_BYTE_FRAMES);
+		rmon_stats->hist_tx[3] =
+			axienet_stat(lp, STAT_TX_256_511_BYTE_FRAMES);
+		rmon_stats->hist_tx[4] =
+			axienet_stat(lp, STAT_TX_512_1023_BYTE_FRAMES);
+		rmon_stats->hist_tx[5] =
+			axienet_stat(lp, STAT_TX_1024_MAX_BYTE_FRAMES);
+		rmon_stats->hist_tx[6] =
+			axienet_stat(lp, STAT_TX_OVERSIZE_FRAMES);
+	} while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
+
+	*ranges = axienet_rmon_ranges;
+}
+
 static const struct ethtool_ops axienet_ethtool_ops = {
 	.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
-				     ETHTOOL_COALESCE_USECS,
+				     ETHTOOL_COALESCE_USECS |
+				     ETHTOOL_COALESCE_STATS_BLOCK_USECS,
 	.get_drvinfo    = axienet_ethtools_get_drvinfo,
 	.get_regs_len   = axienet_ethtools_get_regs_len,
 	.get_regs       = axienet_ethtools_get_regs,
@@ -2003,6 +2252,10 @@  static const struct ethtool_ops axienet_ethtool_ops = {
 	.get_link_ksettings = axienet_ethtools_get_link_ksettings,
 	.set_link_ksettings = axienet_ethtools_set_link_ksettings,
 	.nway_reset	= axienet_ethtools_nway_reset,
+	.get_pause_stats = axienet_ethtools_get_pause_stats,
+	.get_eth_mac_stats = axienet_ethtool_get_eth_mac_stats,
+	.get_eth_ctrl_stats = axienet_ethtool_get_eth_ctrl_stats,
+	.get_rmon_stats = axienet_ethtool_get_rmon_stats,
 };
 
 static struct axienet_local *pcs_to_axienet_local(struct phylink_pcs *pcs)
@@ -2271,6 +2524,10 @@  static int axienet_probe(struct platform_device *pdev)
 
 	u64_stats_init(&lp->rx_stat_sync);
 	u64_stats_init(&lp->tx_stat_sync);
+	u64_stats_init(&lp->hw_stat_sync);
+
+	mutex_init(&lp->stats_lock);
+	INIT_DEFERRABLE_WORK(&lp->stats_work, axienet_refresh_stats);
 
 	lp->axi_clk = devm_clk_get_optional(&pdev->dev, "s_axi_lite_clk");
 	if (!lp->axi_clk) {
@@ -2312,6 +2569,9 @@  static int axienet_probe(struct platform_device *pdev)
 	/* Setup checksum offload, but default to off if not specified */
 	lp->features = 0;
 
+	if (axienet_ior(lp, XAE_ABILITY_OFFSET) & XAE_ABILITY_STATS)
+		lp->features |= XAE_FEATURE_STATS;
+
 	ret = of_property_read_u32(pdev->dev.of_node, "xlnx,txcsum", &value);
 	if (!ret) {
 		switch (value) {
@@ -2527,6 +2787,7 @@  static int axienet_probe(struct platform_device *pdev)
 	lp->coalesce_count_tx = XAXIDMA_DFT_TX_THRESHOLD;
 	lp->coalesce_usec_rx = XAXIDMA_DFT_RX_USEC;
 	lp->coalesce_usec_tx = XAXIDMA_DFT_TX_USEC;
+	lp->coalesce_usec_stats = USEC_PER_SEC;
 
 	ret = axienet_mdio_setup(lp);
 	if (ret)