Message ID | 20250116155350.555374-5-yoong.siang.song@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | xsk: TX metadata Launch Time support | expand |
Hi Siang. On 16/1/2025 11:53 pm, Song Yoong Siang wrote: > Enable Launch Time Control (LTC) support to XDP zero copy via XDP Tx > metadata framework. > > This patch is tested with tools/testing/selftests/bpf/xdp_hw_metadata on > Intel I225-LM Ethernet controller. Below are the test steps and result. > > Test Steps: > 1. At DUT, start xdp_hw_metadata selftest application: > $ sudo ./xdp_hw_metadata enp2s0 -l 1000000000 -L 1 > > 2. At Link Partner, send an UDP packet with VLAN priority 1 to port 9091 of > DUT. > > When launch time is set to 1s in the future, the delta between launch time > and transmit hardware timestamp is equal to 0.016us, as shown in result > below: > 0x562ff5dc8880: rx_desc[4]->addr=84110 addr=84110 comp_addr=84110 EoP > rx_hash: 0xE343384 with RSS type:0x1 > HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to User RX-time sec:0.0002 (183.103 usec) > XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User RX-time sec:0.0001 (80.309 usec) > No rx_vlan_tci or rx_vlan_proto, err=-95 > 0x562ff5dc8880: ping-pong with csum=561c (want c7dd) csum_start=34 csum_offset=6 > HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW Launch-time sec:1.0000 (1000000.000 usec) > 0x562ff5dc8880: complete tx idx=4 addr=4018 > HW Launch-time: 1734578016467548904 (sec:1734578016.4675) delta to HW TX-complete-time sec:0.0000 (0.016 usec) > HW TX-complete-time: 1734578016467548920 (sec:1734578016.4675) delta to User TX-complete-time sec:0.0000 (32.546 usec) > XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User TX-complete-time sec:0.9999 (999929.768 usec) > HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW TX-complete-time sec:1.0000 (1000000.016 usec) > 0x562ff5dc8880: complete rx idx=132 addr=84110 To be cautious, could we perform a stress test by sending a higher number of packets with launch time? For example, we could send 200 packets, each configured with a launch time, and verify that the driver continues to function correctly afterward. > Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> > --- > drivers/net/ethernet/intel/igc/igc_main.c | 78 ++++++++++++++++------- > 1 file changed, 56 insertions(+), 22 deletions(-) > > diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c > index 27872bdea9bd..6857f5f5b4b2 100644 > --- a/drivers/net/ethernet/intel/igc/igc_main.c > +++ b/drivers/net/ethernet/intel/igc/igc_main.c > @@ -1566,6 +1566,26 @@ static bool igc_request_tx_tstamp(struct igc_adapter *adapter, struct sk_buff *s > return false; > } > > +static void igc_insert_empty_packet(struct igc_ring *tx_ring) > +{ > + struct igc_tx_buffer *empty_info; > + struct sk_buff *empty; > + void *data; > + > + empty_info = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; > + empty = alloc_skb(IGC_EMPTY_FRAME_SIZE, GFP_ATOMIC); > + if (!empty) > + return; > + > + data = skb_put(empty, IGC_EMPTY_FRAME_SIZE); > + memset(data, 0, IGC_EMPTY_FRAME_SIZE); > + > + igc_tx_ctxtdesc(tx_ring, 0, false, 0, 0, 0); > + > + if (igc_init_tx_empty_descriptor(tx_ring, empty, empty_info) < 0) > + dev_kfree_skb_any(empty); > +} > + The function igc_insert_empty_packet() appears to wrap existing code to enhance reusability, with no new changes related to enabling launch-time XDP ZC functionality. If so, could we split this into a separate commit? This would make it clearer for the reader to distinguish between the refactoring changes and the new changes related to enabling launch-time XDP ZC support. > static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, > struct igc_ring *tx_ring) > { > @@ -1603,26 +1623,8 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, > skb->tstamp = ktime_set(0, 0); > launch_time = igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty); > > - if (insert_empty) { > - struct igc_tx_buffer *empty_info; > - struct sk_buff *empty; > - void *data; > - > - empty_info = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; > - empty = alloc_skb(IGC_EMPTY_FRAME_SIZE, GFP_ATOMIC); > - if (!empty) > - goto done; > - > - data = skb_put(empty, IGC_EMPTY_FRAME_SIZE); > - memset(data, 0, IGC_EMPTY_FRAME_SIZE); > - > - igc_tx_ctxtdesc(tx_ring, 0, false, 0, 0, 0); > - > - if (igc_init_tx_empty_descriptor(tx_ring, > - empty, > - empty_info) < 0) > - dev_kfree_skb_any(empty); > - } > + if (insert_empty) > + igc_insert_empty_packet(tx_ring); > > done: > /* record the location of the first descriptor for this packet */ > @@ -2955,9 +2957,33 @@ static u64 igc_xsk_fill_timestamp(void *_priv) > return *(u64 *)_priv; > } > > +static void igc_xsk_request_launch_time(u64 launch_time, void *_priv) > +{ > + struct igc_metadata_request *meta_req = _priv; > + struct igc_ring *tx_ring = meta_req->tx_ring; > + __le32 launch_time_offset; > + bool insert_empty = false; > + bool first_flag = false; > + > + if (!tx_ring->launchtime_enable) > + return; > + > + launch_time_offset = igc_tx_launchtime(tx_ring, > + ns_to_ktime(launch_time), > + &first_flag, &insert_empty); > + if (insert_empty) { > + igc_insert_empty_packet(tx_ring); > + meta_req->tx_buffer = > + &tx_ring->tx_buffer_info[tx_ring->next_to_use]; > + } > + > + igc_tx_ctxtdesc(tx_ring, launch_time_offset, first_flag, 0, 0, 0); > +} > + > const struct xsk_tx_metadata_ops igc_xsk_tx_metadata_ops = { > .tmo_request_timestamp = igc_xsk_request_timestamp, > .tmo_fill_timestamp = igc_xsk_fill_timestamp, > + .tmo_request_launch_time = igc_xsk_request_launch_time, > }; > > static void igc_xdp_xmit_zc(struct igc_ring *ring) > @@ -2980,7 +3006,7 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) > ntu = ring->next_to_use; > budget = igc_desc_unused(ring); > > - while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) { > + while (xsk_tx_peek_desc(pool, &xdp_desc) && budget >= 4) { Could we add some explanation on what & why the value "4" is used ?
On 20/1/2025 2:25 pm, Abdul Rahim, Faizal wrote: > > To be cautious, could we perform a stress test by sending a higher number > of packets with launch time? For example, we could send 200 packets, each > configured with a launch time, and verify that the driver continues to > function correctly afterward. > I agree on this point. Could you perform the same stress test on the STMMAC driver as well?
On Monday, January 20, 2025 2:26 PM, Abdul Rahim, Faizal <faizal.abdul.rahim@linux.intel.com> wrote: >Hi Siang. > >On 16/1/2025 11:53 pm, Song Yoong Siang wrote: >> Enable Launch Time Control (LTC) support to XDP zero copy via XDP Tx >> metadata framework. >> >> This patch is tested with tools/testing/selftests/bpf/xdp_hw_metadata on >> Intel I225-LM Ethernet controller. Below are the test steps and result. >> >> Test Steps: >> 1. At DUT, start xdp_hw_metadata selftest application: >> $ sudo ./xdp_hw_metadata enp2s0 -l 1000000000 -L 1 >> >> 2. At Link Partner, send an UDP packet with VLAN priority 1 to port 9091 of >> DUT. >> >> When launch time is set to 1s in the future, the delta between launch time >> and transmit hardware timestamp is equal to 0.016us, as shown in result >> below: >> 0x562ff5dc8880: rx_desc[4]->addr=84110 addr=84110 comp_addr=84110 EoP >> rx_hash: 0xE343384 with RSS type:0x1 >> HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to User RX- >time sec:0.0002 (183.103 usec) >> XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User RX- >time sec:0.0001 (80.309 usec) >> No rx_vlan_tci or rx_vlan_proto, err=-95 >> 0x562ff5dc8880: ping-pong with csum=561c (want c7dd) csum_start=34 >csum_offset=6 >> HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW >Launch-time sec:1.0000 (1000000.000 usec) >> 0x562ff5dc8880: complete tx idx=4 addr=4018 >> HW Launch-time: 1734578016467548904 (sec:1734578016.4675) delta to HW >TX-complete-time sec:0.0000 (0.016 usec) >> HW TX-complete-time: 1734578016467548920 (sec:1734578016.4675) delta >to User TX-complete-time sec:0.0000 (32.546 usec) >> XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User TX- >complete-time sec:0.9999 (999929.768 usec) >> HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW TX- >complete-time sec:1.0000 (1000000.016 usec) >> 0x562ff5dc8880: complete rx idx=132 addr=84110 > >To be cautious, could we perform a stress test by sending a higher number >of packets with launch time? For example, we could send 200 packets, each >configured with a launch time, and verify that the driver continues to >function correctly afterward. > Hi Faizal, Thanks for your review comments. Sure, I can send continuous packets with short interval and share the result in commit msg. >> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> >> --- >> drivers/net/ethernet/intel/igc/igc_main.c | 78 ++++++++++++++++------- >> 1 file changed, 56 insertions(+), 22 deletions(-) >> >> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c >b/drivers/net/ethernet/intel/igc/igc_main.c >> index 27872bdea9bd..6857f5f5b4b2 100644 >> --- a/drivers/net/ethernet/intel/igc/igc_main.c >> +++ b/drivers/net/ethernet/intel/igc/igc_main.c >> @@ -1566,6 +1566,26 @@ static bool igc_request_tx_tstamp(struct igc_adapter >*adapter, struct sk_buff *s >> return false; >> } >> >> +static void igc_insert_empty_packet(struct igc_ring *tx_ring) >> +{ >> + struct igc_tx_buffer *empty_info; >> + struct sk_buff *empty; >> + void *data; >> + >> + empty_info = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; >> + empty = alloc_skb(IGC_EMPTY_FRAME_SIZE, GFP_ATOMIC); >> + if (!empty) >> + return; >> + >> + data = skb_put(empty, IGC_EMPTY_FRAME_SIZE); >> + memset(data, 0, IGC_EMPTY_FRAME_SIZE); >> + >> + igc_tx_ctxtdesc(tx_ring, 0, false, 0, 0, 0); >> + >> + if (igc_init_tx_empty_descriptor(tx_ring, empty, empty_info) < 0) >> + dev_kfree_skb_any(empty); >> +} >> + > >The function igc_insert_empty_packet() appears to wrap existing code to >enhance reusability, with no new changes related to enabling launch-time >XDP ZC functionality. If so, could we split this into a separate commit? >This would make it clearer for the reader to distinguish between the >refactoring changes and the new changes related to enabling launch-time XDP >ZC support. > I am ok to split the patch into two. Will do it on next version submission. >> static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, >> struct igc_ring *tx_ring) >> { >> @@ -1603,26 +1623,8 @@ static netdev_tx_t igc_xmit_frame_ring(struct >sk_buff *skb, >> skb->tstamp = ktime_set(0, 0); >> launch_time = igc_tx_launchtime(tx_ring, txtime, &first_flag, >&insert_empty); >> >> - if (insert_empty) { >> - struct igc_tx_buffer *empty_info; >> - struct sk_buff *empty; >> - void *data; >> - >> - empty_info = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; >> - empty = alloc_skb(IGC_EMPTY_FRAME_SIZE, GFP_ATOMIC); >> - if (!empty) >> - goto done; >> - >> - data = skb_put(empty, IGC_EMPTY_FRAME_SIZE); >> - memset(data, 0, IGC_EMPTY_FRAME_SIZE); >> - >> - igc_tx_ctxtdesc(tx_ring, 0, false, 0, 0, 0); >> - >> - if (igc_init_tx_empty_descriptor(tx_ring, >> - empty, >> - empty_info) < 0) >> - dev_kfree_skb_any(empty); >> - } >> + if (insert_empty) >> + igc_insert_empty_packet(tx_ring); >> >> done: >> /* record the location of the first descriptor for this packet */ >> @@ -2955,9 +2957,33 @@ static u64 igc_xsk_fill_timestamp(void *_priv) >> return *(u64 *)_priv; >> } >> >> +static void igc_xsk_request_launch_time(u64 launch_time, void *_priv) >> +{ >> + struct igc_metadata_request *meta_req = _priv; >> + struct igc_ring *tx_ring = meta_req->tx_ring; >> + __le32 launch_time_offset; >> + bool insert_empty = false; >> + bool first_flag = false; >> + >> + if (!tx_ring->launchtime_enable) >> + return; >> + >> + launch_time_offset = igc_tx_launchtime(tx_ring, >> + ns_to_ktime(launch_time), >> + &first_flag, &insert_empty); >> + if (insert_empty) { >> + igc_insert_empty_packet(tx_ring); >> + meta_req->tx_buffer = >> + &tx_ring->tx_buffer_info[tx_ring->next_to_use]; >> + } >> + >> + igc_tx_ctxtdesc(tx_ring, launch_time_offset, first_flag, 0, 0, 0); >> +} >> + >> const struct xsk_tx_metadata_ops igc_xsk_tx_metadata_ops = { >> .tmo_request_timestamp = igc_xsk_request_timestamp, >> .tmo_fill_timestamp = igc_xsk_fill_timestamp, >> + .tmo_request_launch_time = igc_xsk_request_launch_time, >> }; >> >> static void igc_xdp_xmit_zc(struct igc_ring *ring) >> @@ -2980,7 +3006,7 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) >> ntu = ring->next_to_use; >> budget = igc_desc_unused(ring); >> >> - while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) { >> + while (xsk_tx_peek_desc(pool, &xdp_desc) && budget >= 4) { > >Could we add some explanation on what & why the value "4" is used ? It is because packet with launch time needs 2 descriptors and same goes for the empty packets. Thus, total need 4 descriptors. I will add detail explanation. Thanks & Regards Siang
On Monday, January 20, 2025 3:25 PM, Choong Yong Liang <yong.liang.choong@linux.intel.com> wrote: >On 20/1/2025 2:25 pm, Abdul Rahim, Faizal wrote: >> >> To be cautious, could we perform a stress test by sending a higher number >> of packets with launch time? For example, we could send 200 packets, each >> configured with a launch time, and verify that the driver continues to >> function correctly afterward. >> >I agree on this point. Could you perform the same stress test on the STMMAC >driver as well? Hi Yong Liang, Sure. I will perform the same tests on stmmac and share the results. Thanks & Regards Siang
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 27872bdea9bd..6857f5f5b4b2 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -1566,6 +1566,26 @@ static bool igc_request_tx_tstamp(struct igc_adapter *adapter, struct sk_buff *s return false; } +static void igc_insert_empty_packet(struct igc_ring *tx_ring) +{ + struct igc_tx_buffer *empty_info; + struct sk_buff *empty; + void *data; + + empty_info = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; + empty = alloc_skb(IGC_EMPTY_FRAME_SIZE, GFP_ATOMIC); + if (!empty) + return; + + data = skb_put(empty, IGC_EMPTY_FRAME_SIZE); + memset(data, 0, IGC_EMPTY_FRAME_SIZE); + + igc_tx_ctxtdesc(tx_ring, 0, false, 0, 0, 0); + + if (igc_init_tx_empty_descriptor(tx_ring, empty, empty_info) < 0) + dev_kfree_skb_any(empty); +} + static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, struct igc_ring *tx_ring) { @@ -1603,26 +1623,8 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, skb->tstamp = ktime_set(0, 0); launch_time = igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty); - if (insert_empty) { - struct igc_tx_buffer *empty_info; - struct sk_buff *empty; - void *data; - - empty_info = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; - empty = alloc_skb(IGC_EMPTY_FRAME_SIZE, GFP_ATOMIC); - if (!empty) - goto done; - - data = skb_put(empty, IGC_EMPTY_FRAME_SIZE); - memset(data, 0, IGC_EMPTY_FRAME_SIZE); - - igc_tx_ctxtdesc(tx_ring, 0, false, 0, 0, 0); - - if (igc_init_tx_empty_descriptor(tx_ring, - empty, - empty_info) < 0) - dev_kfree_skb_any(empty); - } + if (insert_empty) + igc_insert_empty_packet(tx_ring); done: /* record the location of the first descriptor for this packet */ @@ -2955,9 +2957,33 @@ static u64 igc_xsk_fill_timestamp(void *_priv) return *(u64 *)_priv; } +static void igc_xsk_request_launch_time(u64 launch_time, void *_priv) +{ + struct igc_metadata_request *meta_req = _priv; + struct igc_ring *tx_ring = meta_req->tx_ring; + __le32 launch_time_offset; + bool insert_empty = false; + bool first_flag = false; + + if (!tx_ring->launchtime_enable) + return; + + launch_time_offset = igc_tx_launchtime(tx_ring, + ns_to_ktime(launch_time), + &first_flag, &insert_empty); + if (insert_empty) { + igc_insert_empty_packet(tx_ring); + meta_req->tx_buffer = + &tx_ring->tx_buffer_info[tx_ring->next_to_use]; + } + + igc_tx_ctxtdesc(tx_ring, launch_time_offset, first_flag, 0, 0, 0); +} + const struct xsk_tx_metadata_ops igc_xsk_tx_metadata_ops = { .tmo_request_timestamp = igc_xsk_request_timestamp, .tmo_fill_timestamp = igc_xsk_fill_timestamp, + .tmo_request_launch_time = igc_xsk_request_launch_time, }; static void igc_xdp_xmit_zc(struct igc_ring *ring) @@ -2980,7 +3006,7 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) ntu = ring->next_to_use; budget = igc_desc_unused(ring); - while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) { + while (xsk_tx_peek_desc(pool, &xdp_desc) && budget >= 4) { struct igc_metadata_request meta_req; struct xsk_tx_metadata *meta = NULL; struct igc_tx_buffer *bi; @@ -3004,6 +3030,12 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) xsk_tx_metadata_request(meta, &igc_xsk_tx_metadata_ops, &meta_req); + /* xsk_tx_metadata_request() may have updated next_to_use */ + ntu = ring->next_to_use; + + /* xsk_tx_metadata_request() may have updated Tx buffer info */ + bi = meta_req.tx_buffer; + tx_desc = IGC_TX_DESC(ring, ntu); tx_desc->read.cmd_type_len = cpu_to_le32(meta_req.cmd_type); tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status); @@ -3021,9 +3053,11 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) ntu++; if (ntu == ring->count) ntu = 0; + + ring->next_to_use = ntu; + budget = igc_desc_unused(ring); } - ring->next_to_use = ntu; if (tx_desc) { igc_flush_tx_descriptors(ring); xsk_tx_release(pool);
Enable Launch Time Control (LTC) support to XDP zero copy via XDP Tx metadata framework. This patch is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel I225-LM Ethernet controller. Below are the test steps and result. Test Steps: 1. At DUT, start xdp_hw_metadata selftest application: $ sudo ./xdp_hw_metadata enp2s0 -l 1000000000 -L 1 2. At Link Partner, send an UDP packet with VLAN priority 1 to port 9091 of DUT. When launch time is set to 1s in the future, the delta between launch time and transmit hardware timestamp is equal to 0.016us, as shown in result below: 0x562ff5dc8880: rx_desc[4]->addr=84110 addr=84110 comp_addr=84110 EoP rx_hash: 0xE343384 with RSS type:0x1 HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to User RX-time sec:0.0002 (183.103 usec) XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User RX-time sec:0.0001 (80.309 usec) No rx_vlan_tci or rx_vlan_proto, err=-95 0x562ff5dc8880: ping-pong with csum=561c (want c7dd) csum_start=34 csum_offset=6 HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW Launch-time sec:1.0000 (1000000.000 usec) 0x562ff5dc8880: complete tx idx=4 addr=4018 HW Launch-time: 1734578016467548904 (sec:1734578016.4675) delta to HW TX-complete-time sec:0.0000 (0.016 usec) HW TX-complete-time: 1734578016467548920 (sec:1734578016.4675) delta to User TX-complete-time sec:0.0000 (32.546 usec) XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User TX-complete-time sec:0.9999 (999929.768 usec) HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW TX-complete-time sec:1.0000 (1000000.016 usec) 0x562ff5dc8880: complete rx idx=132 addr=84110 Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> --- drivers/net/ethernet/intel/igc/igc_main.c | 78 ++++++++++++++++------- 1 file changed, 56 insertions(+), 22 deletions(-)