Message ID | 20240809144229.1370-1-ende.tan@starfivetech.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,1/1] net: stmmac: Set OWN bit last in dwmac4_set_rx_owner() | expand |
On Fri, Aug 09, 2024 at 10:42:29PM +0800, Tan En De wrote: > Ensure that all other bits in the RDES3 descriptor are configured before > transferring ownership of the descriptor to DMA via the OWN bit. Are you seeing things going wrong with real hardware, or is this just code review? If this is a real problem, please add a description of what the user would see. Does this need to be backported in stable? https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html If it does, you should add a Fixes: tag and 'Cc: stable@vger.kernel.org' This will also decide which tree you need to base the patch on: https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html#netdev-faq > static void dwmac4_set_rx_owner(struct dma_desc *p, int disable_rx_ic) > { > - p->des3 |= cpu_to_le32(RDES3_OWN | RDES3_BUFFER1_VALID_ADDR); > + p->des3 |= cpu_to_le32(RDES3_BUFFER1_VALID_ADDR); > > if (!disable_rx_ic) > p->des3 |= cpu_to_le32(RDES3_INT_ON_COMPLETION_EN); > + > + dma_wmb(); > + p->des3 |= cpu_to_le32(RDES3_OWN); Is the problem here that RDES3_INT_ON_COMPLETION_EN is added after the RDES3_OWN above has hit the hardware, so it gets ignored? It seems like it would be better to calculate the value in a local variable, and then assign to p->des3 once. Andrew
Hi Andrew, thanks for taking a look at my patch. > Are you seeing things going wrong with real hardware, or is this just code > review? If this is a real problem, please add a description of what the user > would see. > > Does this need to be backported in stable? On my FPGA, after running iperf3 for a while, the GMAC somehow got stuck, as shown by 0.00 bits/sec, for example: ``` iperf3 -t 6000 -c 192.168.xxx.xxx -i 10 -P 2 -l 128 ... [ 5] 220.00-230.00 sec 2.04 MBytes 1.71 Mbits/sec 3 1.41 KBytes [ 7] 220.00-230.00 sec 2.04 MBytes 1.71 Mbits/sec 3 1.41 KBytes [SUM] 220.00-230.00 sec 4.07 MBytes 3.42 Mbits/sec 6 - - - - - - - - - - - - - - - - - - - - - - - - - [ 5] 230.00-240.01 sec 0.00 Bytes 0.00 bits/sec 2 1.41 KBytes [ 7] 230.00-240.01 sec 0.00 Bytes 0.00 bits/sec 2 1.41 KBytes [SUM] 230.00-240.01 sec 0.00 Bytes 0.00 bits/sec 4 ``` Used devmem to check registers: 0x780 (Rx_Packets_Count_Good_Bad) - The count was incrementing, so packets were still received. 0x114C (DMA_CH0_Current_App_RxDesc) - Value was changing, so DMA did update the RX descriptor address pointer. 0x1160 (DMA_CH0_Status). - Receive Buffer Unavailable RBU = 0. - Receive Interrupt RI = 1 (stuck at 1). which led me to suspect that there was missed RX interrupt. I then came across dwmac4_set_rx_owner() function, and saw the possibility of missed interrupt (when DMA sees OWN bit before INT_ON_COMPLETION bit). However, even with this patch, the problem persists on my FPGA. Therefore, I'd treat this patch as a code review, as I can't provide a concrete proof that this patch fixes any real hardware. > Is the problem here that RDES3_INT_ON_COMPLETION_EN is added after the > RDES3_OWN above has hit the hardware, so it gets ignored? > > It seems like it would be better to calculate the value in a local variable, and > then assign to p->des3 once. I didn't use local variable because I worry about CPU out-of-order execution. For example, ``` local_var = (INT_ON_COMPLETION | OWN) des3 |= local_var ``` CPU optimization might result in this ``` des3 |= INT_ON_COMPLETION des3 |= OWN ``` or worst, out of order like this ``` des3 |= OWN des3 |= INT_ON_COMPLETION ``` which could cause missing interrupt. Regards, Tan En De
On Sat, Aug 10, 2024 at 02:58:50AM +0000, EnDe Tan wrote: > Hi Andrew, thanks for taking a look at my patch. > > > Are you seeing things going wrong with real hardware, or is this just code > > review? If this is a real problem, please add a description of what the user > > would see. > > > > Does this need to be backported in stable? > > On my FPGA, after running iperf3 for a while, the GMAC somehow got stuck, > as shown by 0.00 bits/sec, for example: > ``` > iperf3 -t 6000 -c 192.168.xxx.xxx -i 10 -P 2 -l 128 > ... > [ 5] 220.00-230.00 sec 2.04 MBytes 1.71 Mbits/sec 3 1.41 KBytes > [ 7] 220.00-230.00 sec 2.04 MBytes 1.71 Mbits/sec 3 1.41 KBytes > [SUM] 220.00-230.00 sec 4.07 MBytes 3.42 Mbits/sec 6 > - - - - - - - - - - - - - - - - - - - - - - - - - > [ 5] 230.00-240.01 sec 0.00 Bytes 0.00 bits/sec 2 1.41 KBytes > [ 7] 230.00-240.01 sec 0.00 Bytes 0.00 bits/sec 2 1.41 KBytes > [SUM] 230.00-240.01 sec 0.00 Bytes 0.00 bits/sec 4 > ``` > > Used devmem to check registers: > 0x780 (Rx_Packets_Count_Good_Bad) > - The count was incrementing, so packets were still received. > 0x114C (DMA_CH0_Current_App_RxDesc) > - Value was changing, so DMA did update the RX descriptor address pointer. > 0x1160 (DMA_CH0_Status). > - Receive Buffer Unavailable RBU = 0. > - Receive Interrupt RI = 1 (stuck at 1). > > which led me to suspect that there was missed RX interrupt. > I then came across dwmac4_set_rx_owner() function, and saw the possibility of > missed interrupt (when DMA sees OWN bit before INT_ON_COMPLETION bit). > > However, even with this patch, the problem persists on my FPGA. > Therefore, I'd treat this patch as a code review, as I can't provide a concrete proof > that this patch fixes any real hardware. O.K. Please target this patch to net-next. We can always get it backported to stable later. > > Is the problem here that RDES3_INT_ON_COMPLETION_EN is added after the > > RDES3_OWN above has hit the hardware, so it gets ignored? > > > > It seems like it would be better to calculate the value in a local variable, and > > then assign to p->des3 once. > > I didn't use local variable because I worry about CPU out-of-order execution. > For example, > ``` > local_var = (INT_ON_COMPLETION | OWN) > des3 |= local_var > ``` > CPU optimization might result in this > ``` > des3 |= INT_ON_COMPLETION > des3 |= OWN > ``` > or worst, out of order like this > ``` > des3 |= OWN > des3 |= INT_ON_COMPLETION > ``` > which could cause missing interrupt. I'm assuming the des3 is mapped as non-cached. So each access is expensive. If you can convince the compiler to assemble the value in a register and then perform one write, it will be cheaper than two writes. Andrew
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c index 1c5802e0d7f4..95aea6ad485b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c @@ -186,10 +186,13 @@ static void dwmac4_set_tx_owner(struct dma_desc *p) static void dwmac4_set_rx_owner(struct dma_desc *p, int disable_rx_ic) { - p->des3 |= cpu_to_le32(RDES3_OWN | RDES3_BUFFER1_VALID_ADDR); + p->des3 |= cpu_to_le32(RDES3_BUFFER1_VALID_ADDR); if (!disable_rx_ic) p->des3 |= cpu_to_le32(RDES3_INT_ON_COMPLETION_EN); + + dma_wmb(); + p->des3 |= cpu_to_le32(RDES3_OWN); } static int dwmac4_get_tx_ls(struct dma_desc *p)
Ensure that all other bits in the RDES3 descriptor are configured before transferring ownership of the descriptor to DMA via the OWN bit. Signed-off-by: Tan En De <ende.tan@starfivetech.com> --- drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)