diff mbox series

[net-next] net: skb: free up one bit in tx_flags

Message ID 20250221035938.2891898-1-willemdebruijn.kernel@gmail.com (mailing list archive)
State New
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net: skb: free up one bit in tx_flags | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 42 this patch: 42
netdev/build_tools success Errors and warnings before: 26 (+1) this patch: 26 (+1)
netdev/cc_maintainers warning 6 maintainers not CCed: intel-wired-lan@lists.osuosl.org andrew+netdev@lunn.ch aleksander.lobakin@intel.com przemyslaw.kitszel@intel.com kuniyu@amazon.com horms@kernel.org
netdev/build_clang success Errors and warnings before: 82 this patch: 82
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 4974 this patch: 4974
netdev/checkpatch warning WARNING: line length of 87 exceeds 80 columns WARNING: line length of 94 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 103 this patch: 103
netdev/source_inline success Was 0 now: 0
netdev/contest fail net-next-2025-02-21--15-00 (tests: 893)

Commit Message

Willem de Bruijn Feb. 21, 2025, 3:58 a.m. UTC
From: Willem de Bruijn <willemb@google.com>

The linked series wants to add skb tx completion timestamps.
That needs a bit in skb_shared_info.tx_flags, but all are in use.

A per-skb bit is only needed for features that are configured on a
per packet basis. Per socket features can be read from sk->sk_tsflags.

Per packet tsflags can be set in sendmsg using cmsg, but only those in
SOF_TIMESTAMPING_TX_RECORD_MASK.

Per packet tsflags can also be set without cmsg by sandwiching a
send inbetween two setsockopts:

    val |= SOF_TIMESTAMPING_$FEATURE;
    setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
    write(fd, buf, sz);
    val &= ~SOF_TIMESTAMPING_$FEATURE;
    setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));

Changing a datapath test from skb_shinfo(skb)->tx_flags to
skb->sk->sk_tsflags can change behavior in that case, as the tx_flags
is written before the second setsockopt updates sk_tsflags.

Therefore, only bits can be reclaimed that cannot be set by cmsg and
are also highly unlikely to be used to target individual packets
otherwise.

Free up the bit currently used for SKBTX_HW_TSTAMP_USE_CYCLES. This
selects between clock and free running counter source for HW TX
timestamps. It is probable that all packets of the same socket will
always use the same source.

Link: https://lore.kernel.org/netdev/cover.1739988644.git.pav@iki.fi/
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 drivers/net/ethernet/engleder/tsnep_main.c |  4 ++--
 drivers/net/ethernet/intel/igc/igc_main.c  |  3 ++-
 include/linux/skbuff.h                     |  5 ++---
 net/socket.c                               | 11 +----------
 4 files changed, 7 insertions(+), 16 deletions(-)

Comments

Jason Xing Feb. 21, 2025, 6:12 a.m. UTC | #1
On Fri, Feb 21, 2025 at 11:59 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> From: Willem de Bruijn <willemb@google.com>
>
> The linked series wants to add skb tx completion timestamps.
> That needs a bit in skb_shared_info.tx_flags, but all are in use.
>
> A per-skb bit is only needed for features that are configured on a
> per packet basis. Per socket features can be read from sk->sk_tsflags.
>
> Per packet tsflags can be set in sendmsg using cmsg, but only those in
> SOF_TIMESTAMPING_TX_RECORD_MASK.
>
> Per packet tsflags can also be set without cmsg by sandwiching a
> send inbetween two setsockopts:
>
>     val |= SOF_TIMESTAMPING_$FEATURE;
>     setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
>     write(fd, buf, sz);
>     val &= ~SOF_TIMESTAMPING_$FEATURE;
>     setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
>
> Changing a datapath test from skb_shinfo(skb)->tx_flags to
> skb->sk->sk_tsflags can change behavior in that case, as the tx_flags
> is written before the second setsockopt updates sk_tsflags.
>
> Therefore, only bits can be reclaimed that cannot be set by cmsg and
> are also highly unlikely to be used to target individual packets
> otherwise.

Agreed.

>
> Free up the bit currently used for SKBTX_HW_TSTAMP_USE_CYCLES. This
> selects between clock and free running counter source for HW TX
> timestamps. It is probable that all packets of the same socket will
> always use the same source.
>
> Link: https://lore.kernel.org/netdev/cover.1739988644.git.pav@iki.fi/
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>

Thanks, Willem!
Gerhard Engleder Feb. 21, 2025, 9:11 p.m. UTC | #2
On 21.02.25 04:58, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> The linked series wants to add skb tx completion timestamps.
> That needs a bit in skb_shared_info.tx_flags, but all are in use.
> 
> A per-skb bit is only needed for features that are configured on a
> per packet basis. Per socket features can be read from sk->sk_tsflags.
> 
> Per packet tsflags can be set in sendmsg using cmsg, but only those in
> SOF_TIMESTAMPING_TX_RECORD_MASK.
> 
> Per packet tsflags can also be set without cmsg by sandwiching a
> send inbetween two setsockopts:
> 
>      val |= SOF_TIMESTAMPING_$FEATURE;
>      setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
>      write(fd, buf, sz);
>      val &= ~SOF_TIMESTAMPING_$FEATURE;
>      setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
> 
> Changing a datapath test from skb_shinfo(skb)->tx_flags to
> skb->sk->sk_tsflags can change behavior in that case, as the tx_flags
> is written before the second setsockopt updates sk_tsflags.
> 
> Therefore, only bits can be reclaimed that cannot be set by cmsg and
> are also highly unlikely to be used to target individual packets
> otherwise.
> 
> Free up the bit currently used for SKBTX_HW_TSTAMP_USE_CYCLES. This
> selects between clock and free running counter source for HW TX
> timestamps. It is probable that all packets of the same socket will
> always use the same source.

For separate ptp4l instances with separate clocks this should be true.

> Link: https://lore.kernel.org/netdev/cover.1739988644.git.pav@iki.fi/
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
>   drivers/net/ethernet/engleder/tsnep_main.c |  4 ++--
>   drivers/net/ethernet/intel/igc/igc_main.c  |  3 ++-
>   include/linux/skbuff.h                     |  5 ++---
>   net/socket.c                               | 11 +----------
>   4 files changed, 7 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
> index 0d030cb0b21c..3de4cb06e266 100644
> --- a/drivers/net/ethernet/engleder/tsnep_main.c
> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> @@ -852,8 +852,8 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>   			struct skb_shared_hwtstamps hwtstamps;
>   			u64 timestamp;
>   
> -			if (skb_shinfo(entry->skb)->tx_flags &
> -			    SKBTX_HW_TSTAMP_USE_CYCLES)
> +			if (entry->skb->sk &&
> +			    READ_ONCE(entry->skb->sk->sk_tsflags) & SOF_TIMESTAMPING_BIND_PHC)
>   				timestamp =
>   					__le64_to_cpu(entry->desc_wb->counter);
>   			else
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index 84307bb7313e..0c4216a4552b 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -1650,7 +1650,8 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
>   		if (igc_request_tx_tstamp(adapter, skb, &tstamp_flags)) {
>   			skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
>   			tx_flags |= IGC_TX_FLAGS_TSTAMP | tstamp_flags;
> -			if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP_USE_CYCLES)
> +			if (skb->sk &&
> +			    READ_ONCE(skb->sk->sk_tsflags) & SOF_TIMESTAMPING_BIND_PHC)
>   				tx_flags |= IGC_TX_FLAGS_TSTAMP_TIMER_1;
>   		} else {
>   			adapter->tx_hwtstamp_skipped++;
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index bb2b751d274a..a65b2b08f994 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -478,8 +478,8 @@ enum {
>   	/* device driver is going to provide hardware time stamp */
>   	SKBTX_IN_PROGRESS = 1 << 2,
>   
> -	/* generate hardware time stamp based on cycles if supported */
> -	SKBTX_HW_TSTAMP_USE_CYCLES = 1 << 3,
> +	/* reserved */
> +	SKBTX_RESERVED = 1 << 3,
>   
>   	/* generate wifi status information (where possible) */
>   	SKBTX_WIFI_STATUS = 1 << 4,
> @@ -494,7 +494,6 @@ enum {
>   #define SKBTX_ANY_SW_TSTAMP	(SKBTX_SW_TSTAMP    | \
>   				 SKBTX_SCHED_TSTAMP)
>   #define SKBTX_ANY_TSTAMP	(SKBTX_HW_TSTAMP | \
> -				 SKBTX_HW_TSTAMP_USE_CYCLES | \
>   				 SKBTX_ANY_SW_TSTAMP)
>   
>   /* Definitions for flags in struct skb_shared_info */
> diff --git a/net/socket.c b/net/socket.c
> index 28bae5a94234..2e3e69710ea4 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -680,18 +680,9 @@ void __sock_tx_timestamp(__u32 tsflags, __u8 *tx_flags)
>   {
>   	u8 flags = *tx_flags;
>   
> -	if (tsflags & SOF_TIMESTAMPING_TX_HARDWARE) {
> +	if (tsflags & SOF_TIMESTAMPING_TX_HARDWARE)
>   		flags |= SKBTX_HW_TSTAMP;
>   
> -		/* PTP hardware clocks can provide a free running cycle counter
> -		 * as a time base for virtual clocks. Tell driver to use the
> -		 * free running cycle counter for timestamp if socket is bound
> -		 * to virtual clock.
> -		 */
> -		if (tsflags & SOF_TIMESTAMPING_BIND_PHC)
> -			flags |= SKBTX_HW_TSTAMP_USE_CYCLES;
> -	}
> -
>   	if (tsflags & SOF_TIMESTAMPING_TX_SOFTWARE)
>   		flags |= SKBTX_SW_TSTAMP;
>   

Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
diff mbox series

Patch

diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 0d030cb0b21c..3de4cb06e266 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -852,8 +852,8 @@  static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 			struct skb_shared_hwtstamps hwtstamps;
 			u64 timestamp;
 
-			if (skb_shinfo(entry->skb)->tx_flags &
-			    SKBTX_HW_TSTAMP_USE_CYCLES)
+			if (entry->skb->sk &&
+			    READ_ONCE(entry->skb->sk->sk_tsflags) & SOF_TIMESTAMPING_BIND_PHC)
 				timestamp =
 					__le64_to_cpu(entry->desc_wb->counter);
 			else
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 84307bb7313e..0c4216a4552b 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1650,7 +1650,8 @@  static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
 		if (igc_request_tx_tstamp(adapter, skb, &tstamp_flags)) {
 			skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 			tx_flags |= IGC_TX_FLAGS_TSTAMP | tstamp_flags;
-			if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP_USE_CYCLES)
+			if (skb->sk &&
+			    READ_ONCE(skb->sk->sk_tsflags) & SOF_TIMESTAMPING_BIND_PHC)
 				tx_flags |= IGC_TX_FLAGS_TSTAMP_TIMER_1;
 		} else {
 			adapter->tx_hwtstamp_skipped++;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index bb2b751d274a..a65b2b08f994 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -478,8 +478,8 @@  enum {
 	/* device driver is going to provide hardware time stamp */
 	SKBTX_IN_PROGRESS = 1 << 2,
 
-	/* generate hardware time stamp based on cycles if supported */
-	SKBTX_HW_TSTAMP_USE_CYCLES = 1 << 3,
+	/* reserved */
+	SKBTX_RESERVED = 1 << 3,
 
 	/* generate wifi status information (where possible) */
 	SKBTX_WIFI_STATUS = 1 << 4,
@@ -494,7 +494,6 @@  enum {
 #define SKBTX_ANY_SW_TSTAMP	(SKBTX_SW_TSTAMP    | \
 				 SKBTX_SCHED_TSTAMP)
 #define SKBTX_ANY_TSTAMP	(SKBTX_HW_TSTAMP | \
-				 SKBTX_HW_TSTAMP_USE_CYCLES | \
 				 SKBTX_ANY_SW_TSTAMP)
 
 /* Definitions for flags in struct skb_shared_info */
diff --git a/net/socket.c b/net/socket.c
index 28bae5a94234..2e3e69710ea4 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -680,18 +680,9 @@  void __sock_tx_timestamp(__u32 tsflags, __u8 *tx_flags)
 {
 	u8 flags = *tx_flags;
 
-	if (tsflags & SOF_TIMESTAMPING_TX_HARDWARE) {
+	if (tsflags & SOF_TIMESTAMPING_TX_HARDWARE)
 		flags |= SKBTX_HW_TSTAMP;
 
-		/* PTP hardware clocks can provide a free running cycle counter
-		 * as a time base for virtual clocks. Tell driver to use the
-		 * free running cycle counter for timestamp if socket is bound
-		 * to virtual clock.
-		 */
-		if (tsflags & SOF_TIMESTAMPING_BIND_PHC)
-			flags |= SKBTX_HW_TSTAMP_USE_CYCLES;
-	}
-
 	if (tsflags & SOF_TIMESTAMPING_TX_SOFTWARE)
 		flags |= SKBTX_SW_TSTAMP;