diff mbox series

[net,6/6] igc: Fix inserting of empty frame for launchtime

Message ID 20230710163503.2821068-7-anthony.l.nguyen@intel.com (mailing list archive)
State Accepted
Commit 0bcc62858d6ba62cbade957d69745e6adeed5f3d
Delegated to: Netdev Maintainers
Headers show
Series igc: Fix corner cases for TSN offload | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1341 this patch: 1341
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 1364 this patch: 1364
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1364 this patch: 1364
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Tony Nguyen July 10, 2023, 4:35 p.m. UTC
From: Florian Kauer <florian.kauer@linutronix.de>

The insertion of an empty frame was introduced with
commit db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
in order to ensure that the current cycle has at least one packet if
there is some packet to be scheduled for the next cycle.

However, the current implementation does not properly check if
a packet is already scheduled for the current cycle. Currently,
an empty packet is always inserted if and only if
txtime >= end_of_cycle && txtime > last_tx_cycle
but since last_tx_cycle is always either the end of the current
cycle (end_of_cycle) or the end of a previous cycle, the
second part (txtime > last_tx_cycle) is always true unless
txtime == last_tx_cycle.

What actually needs to be checked here is if the last_tx_cycle
was already written within the current cycle, so an empty frame
should only be inserted if and only if
txtime >= end_of_cycle && end_of_cycle > last_tx_cycle.

This patch does not only avoid an unnecessary insertion, but it
can actually be harmful to insert an empty packet if packets
are already scheduled in the current cycle, because it can lead
to a situation where the empty packet is actually processed
as the first packet in the upcoming cycle shifting the packet
with the first_flag even one cycle into the future, finally leading
to a TX hang.

The TX hang can be reproduced on a i225 with:

    sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \
	    num_tc 1 \
	    map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
	    queues 1@0 \
	    base-time 0 \
	    sched-entry S 01 300000 \
	    flags 0x1 \
	    txtime-delay 500000 \
	    clockid CLOCK_TAI
    sudo tc qdisc replace dev enp1s0 parent 100:1 etf \
	    clockid CLOCK_TAI \
	    delta 500000 \
	    offload \
	    skip_sock_check

and traffic generator

    sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns

with traffic.cfg

    #define ETH_P_IP        0x0800

    {
      /* Ethernet Header */
      0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e,  # MAC Dest - adapt as needed
      0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36,  # MAC Src  - adapt as needed
      const16(ETH_P_IP),

      /* IPv4 Header */
      0b01000101, 0,   # IPv4 version, IHL, TOS
      const16(1028),   # IPv4 total length (UDP length + 20 bytes (IP header))
      const16(2),      # IPv4 ident
      0b01000000, 0,   # IPv4 flags, fragmentation off
      64,              # IPv4 TTL
      17,              # Protocol UDP
      csumip(14, 33),  # IPv4 checksum

      /* UDP Header */
      10,  0, 48, 1,   # IP Src - adapt as needed
      10,  0, 48, 10,  # IP Dest - adapt as needed
      const16(5555),   # UDP Src Port
      const16(6666),   # UDP Dest Port
      const16(1008),   # UDP length (UDP header 8 bytes + payload length)
      csumudp(14, 34), # UDP checksum

      /* Payload */
      fill('W', 1000),
    }

and the observed message with that is for example

 igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang
   Tx Queue             <0>
   TDH                  <32>
   TDT                  <3c>
   next_to_use          <3c>
   next_to_clean        <32>
 buffer_info[next_to_clean]
   time_stamp           <ffff26a8>
   next_to_watch        <00000000632a1828>
   jiffies              <ffff27f8>
   desc.status          <1048000>

Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
Signed-off-by: Florian Kauer <florian.kauer@linutronix.de>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Leon Romanovsky July 11, 2023, 7:11 a.m. UTC | #1
On Mon, Jul 10, 2023 at 09:35:03AM -0700, Tony Nguyen wrote:
> From: Florian Kauer <florian.kauer@linutronix.de>
> 
> The insertion of an empty frame was introduced with
> commit db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
> in order to ensure that the current cycle has at least one packet if
> there is some packet to be scheduled for the next cycle.
> 
> However, the current implementation does not properly check if
> a packet is already scheduled for the current cycle. Currently,
> an empty packet is always inserted if and only if
> txtime >= end_of_cycle && txtime > last_tx_cycle
> but since last_tx_cycle is always either the end of the current
> cycle (end_of_cycle) or the end of a previous cycle, the
> second part (txtime > last_tx_cycle) is always true unless
> txtime == last_tx_cycle.
> 
> What actually needs to be checked here is if the last_tx_cycle
> was already written within the current cycle, so an empty frame
> should only be inserted if and only if
> txtime >= end_of_cycle && end_of_cycle > last_tx_cycle.
> 
> This patch does not only avoid an unnecessary insertion, but it
> can actually be harmful to insert an empty packet if packets
> are already scheduled in the current cycle, because it can lead
> to a situation where the empty packet is actually processed
> as the first packet in the upcoming cycle shifting the packet
> with the first_flag even one cycle into the future, finally leading
> to a TX hang.
> 
> The TX hang can be reproduced on a i225 with:
> 
>     sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \
> 	    num_tc 1 \
> 	    map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
> 	    queues 1@0 \
> 	    base-time 0 \
> 	    sched-entry S 01 300000 \
> 	    flags 0x1 \
> 	    txtime-delay 500000 \
> 	    clockid CLOCK_TAI
>     sudo tc qdisc replace dev enp1s0 parent 100:1 etf \
> 	    clockid CLOCK_TAI \
> 	    delta 500000 \
> 	    offload \
> 	    skip_sock_check
> 
> and traffic generator
> 
>     sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns
> 
> with traffic.cfg
> 
>     #define ETH_P_IP        0x0800
> 
>     {
>       /* Ethernet Header */
>       0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e,  # MAC Dest - adapt as needed
>       0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36,  # MAC Src  - adapt as needed
>       const16(ETH_P_IP),
> 
>       /* IPv4 Header */
>       0b01000101, 0,   # IPv4 version, IHL, TOS
>       const16(1028),   # IPv4 total length (UDP length + 20 bytes (IP header))
>       const16(2),      # IPv4 ident
>       0b01000000, 0,   # IPv4 flags, fragmentation off
>       64,              # IPv4 TTL
>       17,              # Protocol UDP
>       csumip(14, 33),  # IPv4 checksum
> 
>       /* UDP Header */
>       10,  0, 48, 1,   # IP Src - adapt as needed
>       10,  0, 48, 10,  # IP Dest - adapt as needed
>       const16(5555),   # UDP Src Port
>       const16(6666),   # UDP Dest Port
>       const16(1008),   # UDP length (UDP header 8 bytes + payload length)
>       csumudp(14, 34), # UDP checksum
> 
>       /* Payload */
>       fill('W', 1000),
>     }
> 
> and the observed message with that is for example
> 
>  igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang
>    Tx Queue             <0>
>    TDH                  <32>
>    TDT                  <3c>
>    next_to_use          <3c>
>    next_to_clean        <32>
>  buffer_info[next_to_clean]
>    time_stamp           <ffff26a8>
>    next_to_watch        <00000000632a1828>
>    jiffies              <ffff27f8>
>    desc.status          <1048000>
> 
> Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
> Signed-off-by: Florian Kauer <florian.kauer@linutronix.de>
> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
> Tested-by: Naama Meir <naamax.meir@linux.intel.com>
> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
> ---
>  drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 4855caa3bae4..9f93f0f4f752 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1029,7 +1029,7 @@  static __le32 igc_tx_launchtime(struct igc_ring *ring, ktime_t txtime,
 			*first_flag = true;
 			ring->last_ff_cycle = baset_est;
 
-			if (ktime_compare(txtime, ring->last_tx_cycle) > 0)
+			if (ktime_compare(end_of_cycle, ring->last_tx_cycle) > 0)
 				*insert_empty = true;
 		}
 	}