diff mbox series

[net] net: stmmac: xgmac: fix handling of DPP safety error for DMA channels

Message ID 20240123095006.979437-1-0x1207@gmail.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net] net: stmmac: xgmac: fix handling of DPP safety error for DMA channels | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1080 this patch: 1080
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 0 of 0 maintainers
netdev/build_clang success Errors and warnings before: 1095 this patch: 1095
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1098 this patch: 1098
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 19 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-01-25--06-00 (tests: 521)

Commit Message

Furong Xu Jan. 23, 2024, 9:50 a.m. UTC
Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in
XGMAC core") checks and reports safety errors, but leaves the
Data Path Parity Errors for each channel in DMA unhandled at all, lead to
a storm of interrupt.
Fix it by checking and clearing the DMA_DPP_Interrupt_Status register.

Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core")
Signed-off-by: Furong Xu <0x1207@gmail.com>
---
 drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h      | 1 +
 drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++
 2 files changed, 7 insertions(+)

Comments

Serge Semin Jan. 24, 2024, 2:36 p.m. UTC | #1
On Tue, Jan 23, 2024 at 05:50:06PM +0800, Furong Xu wrote:
> Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in
> XGMAC core") checks and reports safety errors, but leaves the
> Data Path Parity Errors for each channel in DMA unhandled at all, lead to
> a storm of interrupt.
> Fix it by checking and clearing the DMA_DPP_Interrupt_Status register.
> 
> Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core")
> Signed-off-by: Furong Xu <0x1207@gmail.com>
> ---
>  drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h      | 1 +
>  drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> index 207ff1799f2c..188e11683136 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> @@ -385,6 +385,7 @@
>  #define XGMAC_DCEIE			BIT(1)
>  #define XGMAC_TCEIE			BIT(0)
>  #define XGMAC_DMA_ECC_INT_STATUS	0x0000306c
> +#define XGMAC_DMA_DPP_INT_STATUS	0x00003074
>  #define XGMAC_DMA_CH_CONTROL(x)		(0x00003100 + (0x80 * (x)))
>  #define XGMAC_SPH			BIT(24)
>  #define XGMAC_PBLx8			BIT(16)
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> index eb48211d9b0e..874e85b499e2 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev,
>  
>  	dwxgmac3_log_error(ndev, value, correctable, "MAC",
>  			   dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats);
> +
> +	value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> +	writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> +
> +	if (value)
> +		netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value);

1. Why not to implement this in the same way as the rest of the safety
errors handle code? (with the flags described by the
dwxgmac3_error_desc-based table and the respective counters being
incremented should the errors were detected)

2. I don't see this IRQ being enabled in the dwxgmac3_safety_feat_config()
method. How come the respective event has turned to be triggered
anyway?

-Serge(y) 

>  }
>  
>  static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= {
> -- 
> 2.34.1
> 
>
Furong Xu Jan. 25, 2024, 2:56 a.m. UTC | #2
On Wed, 24 Jan 2024 17:36:10 +0300
Serge Semin <fancer.lancer@gmail.com> wrote:

> On Tue, Jan 23, 2024 at 05:50:06PM +0800, Furong Xu wrote:
> > Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in
> > XGMAC core") checks and reports safety errors, but leaves the
> > Data Path Parity Errors for each channel in DMA unhandled at all, lead to
> > a storm of interrupt.
> > Fix it by checking and clearing the DMA_DPP_Interrupt_Status register.
> > 
> > Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core")
> > Signed-off-by: Furong Xu <0x1207@gmail.com>
> > ---
> >  drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h      | 1 +
> >  drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++
> >  2 files changed, 7 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > index 207ff1799f2c..188e11683136 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > @@ -385,6 +385,7 @@
> >  #define XGMAC_DCEIE			BIT(1)
> >  #define XGMAC_TCEIE			BIT(0)
> >  #define XGMAC_DMA_ECC_INT_STATUS	0x0000306c
> > +#define XGMAC_DMA_DPP_INT_STATUS	0x00003074
> >  #define XGMAC_DMA_CH_CONTROL(x)		(0x00003100 + (0x80 * (x)))
> >  #define XGMAC_SPH			BIT(24)
> >  #define XGMAC_PBLx8			BIT(16)
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > index eb48211d9b0e..874e85b499e2 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev,
> >  
> >  	dwxgmac3_log_error(ndev, value, correctable, "MAC",
> >  			   dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats);
> > +
> > +	value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> > +	writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> > +
> > +	if (value)
> > +		netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value);  
> 
> 1. Why not to implement this in the same way as the rest of the safety
> errors handle code? (with the flags described by the
> dwxgmac3_error_desc-based table and the respective counters being
> incremented should the errors were detected)
> 
XGMAC_DMA_DPP_INT_STATUS is just a bitmap of DMA RX and TX channels,
bottom 16 bits for 16 DMA TX channels and top 16 bits for 16 DMA RX channels.
No other descriptions.

And the counters should be updated, I will send a new patch.

> 2. I don't see this IRQ being enabled in the dwxgmac3_safety_feat_config()
> method. How come the respective event has turned to be triggered
> anyway?
This error report is enabled by default, and cannot be disabled or marked(as Synopsys Databook says).
What we can do is clearing it when it asserts.
> 
> -Serge(y) 
> 
> >  }
> >  
> >  static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= {
> > -- 
> > 2.34.1
> > 
> >
Serge Semin Jan. 25, 2024, 7:07 p.m. UTC | #3
On Thu, Jan 25, 2024 at 10:56:20AM +0800, Furong Xu wrote:
> On Wed, 24 Jan 2024 17:36:10 +0300
> Serge Semin <fancer.lancer@gmail.com> wrote:
> 
> > On Tue, Jan 23, 2024 at 05:50:06PM +0800, Furong Xu wrote:
> > > Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in
> > > XGMAC core") checks and reports safety errors, but leaves the
> > > Data Path Parity Errors for each channel in DMA unhandled at all, lead to
> > > a storm of interrupt.
> > > Fix it by checking and clearing the DMA_DPP_Interrupt_Status register.
> > > 
> > > Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core")
> > > Signed-off-by: Furong Xu <0x1207@gmail.com>
> > > ---
> > >  drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h      | 1 +
> > >  drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++
> > >  2 files changed, 7 insertions(+)
> > > 
> > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > > index 207ff1799f2c..188e11683136 100644
> > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > > @@ -385,6 +385,7 @@
> > >  #define XGMAC_DCEIE			BIT(1)
> > >  #define XGMAC_TCEIE			BIT(0)
> > >  #define XGMAC_DMA_ECC_INT_STATUS	0x0000306c
> > > +#define XGMAC_DMA_DPP_INT_STATUS	0x00003074
> > >  #define XGMAC_DMA_CH_CONTROL(x)		(0x00003100 + (0x80 * (x)))
> > >  #define XGMAC_SPH			BIT(24)
> > >  #define XGMAC_PBLx8			BIT(16)
> > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > > index eb48211d9b0e..874e85b499e2 100644
> > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > > @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev,
> > >  
> > >  	dwxgmac3_log_error(ndev, value, correctable, "MAC",
> > >  			   dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats);
> > > +
> > > +	value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> > > +	writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> > > +
> > > +	if (value)
> > > +		netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value);  
> > 
> > 1. Why not to implement this in the same way as the rest of the safety
> > errors handle code? (with the flags described by the
> > dwxgmac3_error_desc-based table and the respective counters being
> > incremented should the errors were detected)
> > 

> XGMAC_DMA_DPP_INT_STATUS is just a bitmap of DMA RX and TX channels,
> bottom 16 bits for 16 DMA TX channels and top 16 bits for 16 DMA RX channels.
> No other descriptions.
> 
> And the counters should be updated, I will send a new patch.

Ok. I'll wait for this patch v2 then with the counters fixed. Please
also note that you are adding the _DMA_ DPP events handling support.
Thus the more suitable place for this change would be
dwmac5_handle_dma_err().

> 
> > 2. I don't see this IRQ being enabled in the dwxgmac3_safety_feat_config()
> > method. How come the respective event has turned to be triggered
> > anyway?
> This error report is enabled by default, and cannot be disabled or marked(as Synopsys Databook says).
> What we can do is clearing it when it asserts.

This sounds so strange that I can barely believe in it. The DW QoS Eth
MTL DPP feature can be enabled/disabled, but the DW XGMAC DMA DPP
can't? This doesn't look logical. What's the point in having a never
maskable IRQ for not that much crucial but optional feature? Moreover
DPP adds some data flow overhead. If we are sure that no problem with
the device data paths, then it seems redundant to have it always
enabled. So I guess it must be switchable. Are you completely sure it
isn't?

-Serge(y)

> > 
> > -Serge(y) 
> > 
> > >  }
> > >  
> > >  static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= {
> > > -- 
> > > 2.34.1
> > > 
> > >   
>
Furong Xu Jan. 26, 2024, 3:31 a.m. UTC | #4
On Thu, 25 Jan 2024 22:07:15 +0300
Serge Semin <fancer.lancer@gmail.com> wrote:

> On Thu, Jan 25, 2024 at 10:56:20AM +0800, Furong Xu wrote:
> > On Wed, 24 Jan 2024 17:36:10 +0300
> > Serge Semin <fancer.lancer@gmail.com> wrote:
> >   
> > > On Tue, Jan 23, 2024 at 05:50:06PM +0800, Furong Xu wrote:  
> > > > Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in
> > > > XGMAC core") checks and reports safety errors, but leaves the
> > > > Data Path Parity Errors for each channel in DMA unhandled at all, lead to
> > > > a storm of interrupt.
> > > > Fix it by checking and clearing the DMA_DPP_Interrupt_Status register.
> > > > 
> > > > Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core")
> > > > Signed-off-by: Furong Xu <0x1207@gmail.com>
> > > > ---
> > > >  drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h      | 1 +
> > > >  drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++
> > > >  2 files changed, 7 insertions(+)
> > > > 
> > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > > > index 207ff1799f2c..188e11683136 100644
> > > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > > > @@ -385,6 +385,7 @@
> > > >  #define XGMAC_DCEIE			BIT(1)
> > > >  #define XGMAC_TCEIE			BIT(0)
> > > >  #define XGMAC_DMA_ECC_INT_STATUS	0x0000306c
> > > > +#define XGMAC_DMA_DPP_INT_STATUS	0x00003074
> > > >  #define XGMAC_DMA_CH_CONTROL(x)		(0x00003100 + (0x80 * (x)))
> > > >  #define XGMAC_SPH			BIT(24)
> > > >  #define XGMAC_PBLx8			BIT(16)
> > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > > > index eb48211d9b0e..874e85b499e2 100644
> > > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > > > @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev,
> > > >  
> > > >  	dwxgmac3_log_error(ndev, value, correctable, "MAC",
> > > >  			   dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats);
> > > > +
> > > > +	value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> > > > +	writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> > > > +
> > > > +	if (value)
> > > > +		netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value);    
> > > 
> > > 1. Why not to implement this in the same way as the rest of the safety
> > > errors handle code? (with the flags described by the
> > > dwxgmac3_error_desc-based table and the respective counters being
> > > incremented should the errors were detected)
> > >   
> 
> > XGMAC_DMA_DPP_INT_STATUS is just a bitmap of DMA RX and TX channels,
> > bottom 16 bits for 16 DMA TX channels and top 16 bits for 16 DMA RX channels.
> > No other descriptions.
> > 
> > And the counters should be updated, I will send a new patch.  
> 
> Ok. I'll wait for this patch v2 then with the counters fixed. Please
> also note that you are adding the _DMA_ DPP events handling support.
> Thus the more suitable place for this change would be
> dwmac5_handle_dma_err().
> 
> >   
> > > 2. I don't see this IRQ being enabled in the dwxgmac3_safety_feat_config()
> > > method. How come the respective event has turned to be triggered
> > > anyway?  
> > This error report is enabled by default, and cannot be disabled or marked(as Synopsys Databook says).
> > What we can do is clearing it when it asserts.  
> 
> This sounds so strange that I can barely believe in it. The DW QoS Eth
> MTL DPP feature can be enabled/disabled, but the DW XGMAC DMA DPP
> can't? This doesn't look logical. What's the point in having a never
> maskable IRQ for not that much crucial but optional feature? Moreover
> DPP adds some data flow overhead. If we are sure that no problem with
> the device data paths, then it seems redundant to have it always
> enabled. So I guess it must be switchable. Are you completely sure it
> isn't?

Sorry for my bad explanation.

Double checked DMA_DPP error report path on my device.

XGMAC DMA_DPP is enable by DDPP bit of MTL_DPP_Control.
DDPP bit is default to 0(Data path Parity Protection is enabled).
When DDPP bit is set to 1(Data path Parity Protection is disabled), no DMA_DPP interrupt is reported.

Once DMA_DPP interrupt is reported, there is no control bit to disable it or mask it.
DMA_DPP error is unrecoverable type, and unrecoverable error interrupt cannot be disabled or masked,
this is a design(as Synopsys Databook says).

A explicit ops on MTL_DPP_Control to clear DDPP bit can add to dwxgmac3_safety_feat_config
to make code looks better.

> 
> -Serge(y)
> 
> > > 
> > > -Serge(y) 
> > >   
> > > >  }
> > > >  
> > > >  static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= {
> > > > -- 
> > > > 2.34.1
> > > > 
> > > >     
> >
diff mbox series

Patch

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
index 207ff1799f2c..188e11683136 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
@@ -385,6 +385,7 @@ 
 #define XGMAC_DCEIE			BIT(1)
 #define XGMAC_TCEIE			BIT(0)
 #define XGMAC_DMA_ECC_INT_STATUS	0x0000306c
+#define XGMAC_DMA_DPP_INT_STATUS	0x00003074
 #define XGMAC_DMA_CH_CONTROL(x)		(0x00003100 + (0x80 * (x)))
 #define XGMAC_SPH			BIT(24)
 #define XGMAC_PBLx8			BIT(16)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
index eb48211d9b0e..874e85b499e2 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
@@ -745,6 +745,12 @@  static void dwxgmac3_handle_mac_err(struct net_device *ndev,
 
 	dwxgmac3_log_error(ndev, value, correctable, "MAC",
 			   dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats);
+
+	value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS);
+	writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS);
+
+	if (value)
+		netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value);
 }
 
 static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= {