[RFC] net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message

Message ID: 20240829000355.1172094-1-vadfed@meta.com
State: Superseded
Delegated to: Netdev Maintainers
Series: [RFC] net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message

Checks

Context                         Check    Description
netdev/series_format            warning  Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection           success  Guessed tree name to be net-next, async
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 91 this patch: 91
netdev/build_tools              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           warning  3 maintainers not CCed: edumazet@google.com arnd@arndb.de linux-arch@vger.kernel.org
netdev/build_clang              success  Errors and warnings before: 1086 this patch: 1086
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 15049 this patch: 15049
netdev/checkpatch               fail     ERROR: code indent should use tabs where possible; WARNING: please, no spaces at the start of a line
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 18 this patch: 18
netdev/source_inline            success  Was 0 now: 0

Commit Message

Vadim Fedorenko Aug. 29, 2024, 12:03 a.m. UTC
The SOF_TIMESTAMPING_OPT_ID socket option flag provides a way to correlate
TX timestamps with packets sent via a socket. Unfortunately, there is no way
to reliably predict the socket timestamp ID value when sendmsg returns an
error [1]. This patch adds a new control message type that lets user-space
software control the mapping between packets and ID values by providing an
ID with each sendmsg call. This works for UDP sockets only, so an explicit
check is added to the control message parser. Also, there is no easy way to
use 0 as a provided ID, so this value is treated as invalid.

[1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
---
 include/net/inet_sock.h           |  1 +
 include/net/sock.h                |  1 +
 include/uapi/asm-generic/socket.h |  2 ++
 net/core/sock.c                   | 14 ++++++++++++++
 net/ipv4/ip_output.c              | 11 +++++++++--
 net/ipv6/ip6_output.c             | 11 +++++++++--
 6 files changed, 36 insertions(+), 4 deletions(-)
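
As an illustration of the user-space side described in the commit message, here is a minimal sketch of how a sender might attach its own ID to a sendmsg call. The helper names enable_tx_tstamp_opt_id() and attach_ts_opt_id() are hypothetical, and the value 78 for SCM_TS_OPT_ID is taken from this RFC only; it is not in released uapi headers and may change in a later revision. Per the parser in this patch, the kernel is expected to return -EINVAL unless SOF_TIMESTAMPING_OPT_ID is already set on the socket.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Assumption: value copied from this RFC; it may differ once merged. */
#ifndef SCM_TS_OPT_ID
#define SCM_TS_OPT_ID 78
#endif

/* Hypothetical helper: enable software TX timestamps with OPT_ID on fd.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_tx_tstamp_opt_id(int fd)
{
	unsigned int flags = SOF_TIMESTAMPING_TX_SOFTWARE |
			     SOF_TIMESTAMPING_SOFTWARE |
			     SOF_TIMESTAMPING_OPT_ID;

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}

/* Hypothetical helper: attach a caller-chosen timestamp ID to msg.
 * buf must hold at least CMSG_SPACE(sizeof(uint32_t)) bytes and be
 * suitably aligned for struct cmsghdr.
 * Returns 0 on success, -1 if id is 0 (rejected by this RFC) or buf is too small. */
int attach_ts_opt_id(struct msghdr *msg, void *buf, size_t buflen, uint32_t id)
{
	struct cmsghdr *cmsg;

	if (id == 0 || buflen < CMSG_SPACE(sizeof(id)))
		return -1;

	memset(buf, 0, buflen);
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(id));

	cmsg = CMSG_FIRSTHDR(msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_TS_OPT_ID;
	/* The parser in this patch requires exactly CMSG_LEN(sizeof(u32)). */
	cmsg->cmsg_len = CMSG_LEN(sizeof(id));
	memcpy(CMSG_DATA(cmsg), &id, sizeof(id));
	return 0;
}
```

A caller would fill in msg_name and msg_iov as usual, call attach_ts_opt_id() with its own sequence number, and then pass the msghdr to sendmsg().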

Comments

Willem de Bruijn Aug. 29, 2024, 1:31 p.m. UTC | #1
Vadim Fedorenko wrote:
> SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
> timestamps

+1 on the feature. Just a few minor points.

Not a hard requirement, but it would be nice to have a test,
e.g., as a tools/testing/../txtimestamp.c extension.

> and packets sent via socket. Unfortunately, there is no way
> to reliably predict socket timestamp ID value in case of error returned
> by sendmsg [1].

Might be good to copy more context from the discussion to explain why
reliable OPT_ID is infeasible. For UDP, it is as simple as lockless
transmit. For RAW, things like MSG_MORE come into play.

> This patch adds new control message type to give user-space
> software an opportunity to control the mapping between packets and
> values by providing ID with each sendmsg. This works fine for UDP
> sockets only, and explicit check is added to control message parser.
> Also, there is no easy way to use 0 as provided ID, so this is value
> treated as invalid.

This is because the code branches on a non-zero value in the cookie
and otherwise falls back to sk_tskey. Please make this explicit. Or
perhaps better, add a bit in the cookie so that the full 32-bit space
can be used.

> [1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/
> 
> Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
> ---
>  include/net/inet_sock.h           |  1 +
>  include/net/sock.h                |  1 +
>  include/uapi/asm-generic/socket.h |  2 ++
>  net/core/sock.c                   | 14 ++++++++++++++
>  net/ipv4/ip_output.c              | 11 +++++++++--
>  net/ipv6/ip6_output.c             | 11 +++++++++--
>  6 files changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
> index 394c3b66065e..7e8545311557 100644
> --- a/include/net/inet_sock.h
> +++ b/include/net/inet_sock.h
> @@ -174,6 +174,7 @@ struct inet_cork {
>  	__s16			tos;
>  	char			priority;
>  	__u16			gso_size;
> +	u32			ts_opt_id;
>  	u64			transmit_time;
>  	u32			mark;
>  };

Ah there's a hole here. Nice!

> diff --git a/include/net/sock.h b/include/net/sock.h
> index f51d61fab059..73e21dad5660 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1794,6 +1794,7 @@ struct sockcm_cookie {
>  	u64 transmit_time;
>  	u32 mark;
>  	u32 tsflags;
> +	u32 ts_opt_id;
>  };
>  
>  static inline void sockcm_init(struct sockcm_cookie *sockc,
> diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
> index 8ce8a39a1e5f..db3df3e74b01 100644
> --- a/include/uapi/asm-generic/socket.h
> +++ b/include/uapi/asm-generic/socket.h
> @@ -135,6 +135,8 @@
>  #define SO_PASSPIDFD		76
>  #define SO_PEERPIDFD		77
>  
> +#define SCM_TS_OPT_ID		78
> +
>  #if !defined(__KERNEL__)
>  
>  #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 468b1239606c..918cb6a0dcba 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -2859,6 +2859,20 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
>  			return -EINVAL;
>  		sockc->transmit_time = get_unaligned((u64 *)CMSG_DATA(cmsg));
>  		break;
> +	case SCM_TS_OPT_ID:
> +		/* allow this option for UDP sockets only */
> +		if (!sk_is_udp(sk))
> +			return -EINVAL;
> +		tsflags = READ_ONCE(sk->sk_tsflags);
> +		if (!(tsflags & SOF_TIMESTAMPING_OPT_ID))
> +			return -EINVAL;
> +		if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
> +			return -EINVAL;
> +		sockc->ts_opt_id = get_unaligned((u32 *)CMSG_DATA(cmsg));

Is the get_unaligned here needed? I don't usually see that on
CMSG_DATA accesses. Even though they are indeed likely to be
unaligned.
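
For readers less familiar with the issue, here is a user-space sketch of what an alignment-safe 32-bit load looks like. It uses memcpy as the portable idiom; this is only an analogy for the kernel's get_unaligned(), not its implementation, and load_u32_unaligned() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address with no alignment guarantee.
 * memcpy lets the compiler emit whatever access is safe on the target,
 * whereas dereferencing a cast pointer may fault on strict-alignment CPUs. */
uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```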

> +		/* do not allow 0 as packet id for timestamp */
> +		if (!sockc->ts_opt_id)
> +			return -EINVAL;
> +		break;
>  	/* SCM_RIGHTS and SCM_CREDENTIALS are semantically in SOL_UNIX. */
>  	case SCM_RIGHTS:
>  	case SCM_CREDENTIALS:
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index b90d0f78ac80..f1e6695cafd2 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1050,8 +1050,14 @@ static int __ip_append_data(struct sock *sk,
>  
>  	hold_tskey = cork->tx_flags & SKBTX_ANY_TSTAMP &&
>  		     READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_OPT_ID;
> -	if (hold_tskey)
> -		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
> +	if (hold_tskey) {
> +		if (cork->ts_opt_id) {
> +			hold_tskey = false;
> +			tskey = cork->ts_opt_id;
> +		} else {
> +			tskey = atomic_inc_return(&sk->sk_tskey) - 1;
> +		}
> +	}
>  
>  	/* So, what's going on in the loop below?
>  	 *
> @@ -1324,6 +1330,7 @@ static int ip_setup_cork(struct sock *sk, struct inet_cork *cork,
>  	cork->mark = ipc->sockc.mark;
>  	cork->priority = ipc->priority;
>  	cork->transmit_time = ipc->sockc.transmit_time;
> +	cork->ts_opt_id = ipc->sockc.ts_opt_id;
>  	cork->tx_flags = 0;
>  	sock_tx_timestamp(sk, ipc->sockc.tsflags, &cork->tx_flags);
>  
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index f26841f1490f..602064250546 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1401,6 +1401,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
>  	cork->base.gso_size = ipc6->gso_size;
>  	cork->base.tx_flags = 0;
>  	cork->base.mark = ipc6->sockc.mark;
> +	cork->base.ts_opt_id = ipc6->sockc.ts_opt_id;
>  	sock_tx_timestamp(sk, ipc6->sockc.tsflags, &cork->base.tx_flags);
>  
>  	cork->base.length = 0;
> @@ -1545,8 +1546,14 @@ static int __ip6_append_data(struct sock *sk,
>  
>  	hold_tskey = cork->tx_flags & SKBTX_ANY_TSTAMP &&
>  		     READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_OPT_ID;
> -	if (hold_tskey)
> -		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
> +	if (hold_tskey) {
> +		if (cork->ts_opt_id) {
> +			hold_tskey = false;
> +			tskey = cork->ts_opt_id;
> +		} else {
> +			tskey = atomic_inc_return(&sk->sk_tskey) - 1;
> +		}
> +	}
>  
>  	/*
>  	 * Let's try using as much space as possible.
> -- 
> 2.43.5
>

Vadim Fedorenko Aug. 29, 2024, 2:13 p.m. UTC | #2
On 29/08/2024 14:31, Willem de Bruijn wrote:
> Vadim Fedorenko wrote:
>> SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
>> timestamps
> 
> +1 on the feature. Few minor points only.
> 
> Not a hard requirement, but would be nice if there was a test,
> e.g., as a tools/testing/../txtimestamp.c extension.

Sure, I'll add some tests in the next version.


>> and packets sent via socket. Unfortunately, there is no way
>> to reliably predict socket timestamp ID value in case of error returned
>> by sendmsg [1].
> 
> Might be good to copy more context from the discussion to explain why
> reliable OPT_ID is infeasible. For UDP, it is as simple as lockless
> transmit. For RAW, things like MSG_MORE come into play.

Ok, I'll add it, thanks!

>> This patch adds new control message type to give user-space
>> software an opportunity to control the mapping between packets and
>> values by providing ID with each sendmsg. This works fine for UDP
>> sockets only, and explicit check is added to control message parser.
>> Also, there is no easy way to use 0 as provided ID, so this is value
>> treated as invalid.
> 
> This is because the code branches on non-zero value in the cookie,
> else uses ts_key. Please make this explicit. Or perhaps better, add a
> bit in the cookie so that the full 32-bit space can be used.

Adding a bit in the cookie is not enough, I have to add another flag to
inet_cork. And we are running out of space for tx flags, 
inet_cork::tx_flags is u8 and we have only 1 bit left for SKBTX* enum.
Do you think it's OK to use this last bit for OPT_ID feature?

>> [1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/
>>
>> Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
>> ---
>>   include/net/inet_sock.h           |  1 +
>>   include/net/sock.h                |  1 +
>>   include/uapi/asm-generic/socket.h |  2 ++
>>   net/core/sock.c                   | 14 ++++++++++++++
>>   net/ipv4/ip_output.c              | 11 +++++++++--
>>   net/ipv6/ip6_output.c             | 11 +++++++++--
>>   6 files changed, 36 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
>> index 394c3b66065e..7e8545311557 100644
>> --- a/include/net/inet_sock.h
>> +++ b/include/net/inet_sock.h
>> @@ -174,6 +174,7 @@ struct inet_cork {
>>   	__s16			tos;
>>   	char			priority;
>>   	__u16			gso_size;
>> +	u32			ts_opt_id;
>>   	u64			transmit_time;
>>   	u32			mark;
>>   };
> 
> Ah there's a hole here. Nice!
> 
>> diff --git a/include/net/sock.h b/include/net/sock.h
>> index f51d61fab059..73e21dad5660 100644
>> --- a/include/net/sock.h
>> +++ b/include/net/sock.h
>> @@ -1794,6 +1794,7 @@ struct sockcm_cookie {
>>   	u64 transmit_time;
>>   	u32 mark;
>>   	u32 tsflags;
>> +	u32 ts_opt_id;
>>   };
>>   
>>   static inline void sockcm_init(struct sockcm_cookie *sockc,
>> diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
>> index 8ce8a39a1e5f..db3df3e74b01 100644
>> --- a/include/uapi/asm-generic/socket.h
>> +++ b/include/uapi/asm-generic/socket.h
>> @@ -135,6 +135,8 @@
>>   #define SO_PASSPIDFD		76
>>   #define SO_PEERPIDFD		77
>>   
>> +#define SCM_TS_OPT_ID		78
>> +
>>   #if !defined(__KERNEL__)
>>   
>>   #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index 468b1239606c..918cb6a0dcba 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -2859,6 +2859,20 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
>>   			return -EINVAL;
>>   		sockc->transmit_time = get_unaligned((u64 *)CMSG_DATA(cmsg));
>>   		break;
>> +	case SCM_TS_OPT_ID:
>> +		/* allow this option for UDP sockets only */
>> +		if (!sk_is_udp(sk))
>> +			return -EINVAL;
>> +		tsflags = READ_ONCE(sk->sk_tsflags);
>> +		if (!(tsflags & SOF_TIMESTAMPING_OPT_ID))
>> +			return -EINVAL;
>> +		if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
>> +			return -EINVAL;
>> +		sockc->ts_opt_id = get_unaligned((u32 *)CMSG_DATA(cmsg));
> 
> Is the get_unaligned here needed? I don't usually see that on
> CMSG_DATA accesses. Even though they are indeed likely to be
> unaligned.

Well, maybe you are right and we don't need get_unaligned for u32
here, at least SO_MARK uses direct access. I have no strong opinion.

>> +		/* do not allow 0 as packet id for timestamp */
>> +		if (!sockc->ts_opt_id)
>> +			return -EINVAL;
>> +		break;
>>   	/* SCM_RIGHTS and SCM_CREDENTIALS are semantically in SOL_UNIX. */
>>   	case SCM_RIGHTS:
>>   	case SCM_CREDENTIALS:
>> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>> index b90d0f78ac80..f1e6695cafd2 100644
>> --- a/net/ipv4/ip_output.c
>> +++ b/net/ipv4/ip_output.c
>> @@ -1050,8 +1050,14 @@ static int __ip_append_data(struct sock *sk,
>>   
>>   	hold_tskey = cork->tx_flags & SKBTX_ANY_TSTAMP &&
>>   		     READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_OPT_ID;
>> -	if (hold_tskey)
>> -		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
>> +	if (hold_tskey) {
>> +		if (cork->ts_opt_id) {
>> +			hold_tskey = false;
>> +			tskey = cork->ts_opt_id;
>> +		} else {
>> +			tskey = atomic_inc_return(&sk->sk_tskey) - 1;
>> +		}
>> +	}
>>   
>>   	/* So, what's going on in the loop below?
>>   	 *
>> @@ -1324,6 +1330,7 @@ static int ip_setup_cork(struct sock *sk, struct inet_cork *cork,
>>   	cork->mark = ipc->sockc.mark;
>>   	cork->priority = ipc->priority;
>>   	cork->transmit_time = ipc->sockc.transmit_time;
>> +	cork->ts_opt_id = ipc->sockc.ts_opt_id;
>>   	cork->tx_flags = 0;
>>   	sock_tx_timestamp(sk, ipc->sockc.tsflags, &cork->tx_flags);
>>   
>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>> index f26841f1490f..602064250546 100644
>> --- a/net/ipv6/ip6_output.c
>> +++ b/net/ipv6/ip6_output.c
>> @@ -1401,6 +1401,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
>>   	cork->base.gso_size = ipc6->gso_size;
>>   	cork->base.tx_flags = 0;
>>   	cork->base.mark = ipc6->sockc.mark;
>> +	cork->base.ts_opt_id = ipc6->sockc.ts_opt_id;
>>   	sock_tx_timestamp(sk, ipc6->sockc.tsflags, &cork->base.tx_flags);
>>   
>>   	cork->base.length = 0;
>> @@ -1545,8 +1546,14 @@ static int __ip6_append_data(struct sock *sk,
>>   
>>   	hold_tskey = cork->tx_flags & SKBTX_ANY_TSTAMP &&
>>   		     READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_OPT_ID;
>> -	if (hold_tskey)
>> -		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
>> +	if (hold_tskey) {
>> +		if (cork->ts_opt_id) {
>> +			hold_tskey = false;
>> +			tskey = cork->ts_opt_id;
>> +		} else {
>> +			tskey = atomic_inc_return(&sk->sk_tskey) - 1;
>> +		}
>> +	}
>>   
>>   	/*
>>   	 * Let's try using as much space as possible.
>> -- 
>> 2.43.5
>>

Willem de Bruijn Aug. 29, 2024, 2:27 p.m. UTC | #3
Vadim Fedorenko wrote:
> On 29/08/2024 14:31, Willem de Bruijn wrote:
> > Vadim Fedorenko wrote:
> >> SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
> >> timestamps
> > 
> > +1 on the feature. Few minor points only.
> > 
> > Not a hard requirement, but would be nice if there was a test,
> > e.g., as a tools/testing/../txtimestamp.c extension.
> 
> Sure, I'll add some tests in the next version.
> 
> 
> >> and packets sent via socket. Unfortunately, there is no way
> >> to reliably predict socket timestamp ID value in case of error returned
> >> by sendmsg [1].
> > 
> > Might be good to copy more context from the discussion to explain why
> > reliable OPT_ID is infeasible. For UDP, it is as simple as lockless
> > transmit. For RAW, things like MSG_MORE come into play.
> 
> Ok, I'll add it, thanks!
> 
> >> This patch adds new control message type to give user-space
> >> software an opportunity to control the mapping between packets and
> >> values by providing ID with each sendmsg. This works fine for UDP
> >> sockets only, and explicit check is added to control message parser.
> >> Also, there is no easy way to use 0 as provided ID, so this is value
> >> treated as invalid.
> > 
> > This is because the code branches on non-zero value in the cookie,
> > else uses ts_key. Please make this explicit. Or perhaps better, add a
> > bit in the cookie so that the full 32-bit space can be used.
> 
> Adding a bit in the cookie is not enough, I have to add another flag to
> inet_cork. And we are running out of space for tx flags, 
> inet_cork::tx_flags is u8 and we have only 1 bit left for SKBTX* enum.
> Do you think it's OK to use this last bit for OPT_ID feature?

No, that space is particularly constrained in skb_shinfo.

Either a separate bit in inet_cork, or just keep as is.

Vadim Fedorenko Aug. 29, 2024, 3 p.m. UTC | #4
On 29/08/2024 15:27, Willem de Bruijn wrote:
> Vadim Fedorenko wrote:
>> On 29/08/2024 14:31, Willem de Bruijn wrote:
>>> Vadim Fedorenko wrote:
>>>> SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
>>>> timestamps
>>>
>>> +1 on the feature. Few minor points only.
>>>
>>> Not a hard requirement, but would be nice if there was a test,
>>> e.g., as a tools/testing/../txtimestamp.c extension.
>>
>> Sure, I'll add some tests in the next version.
>>
>>
>>>> and packets sent via socket. Unfortunately, there is no way
>>>> to reliably predict socket timestamp ID value in case of error returned
>>>> by sendmsg [1].
>>>
>>> Might be good to copy more context from the discussion to explain why
>>> reliable OPT_ID is infeasible. For UDP, it is as simple as lockless
>>> transmit. For RAW, things like MSG_MORE come into play.
>>
>> Ok, I'll add it, thanks!
>>
>>>> This patch adds new control message type to give user-space
>>>> software an opportunity to control the mapping between packets and
>>>> values by providing ID with each sendmsg. This works fine for UDP
>>>> sockets only, and explicit check is added to control message parser.
>>>> Also, there is no easy way to use 0 as provided ID, so this is value
>>>> treated as invalid.
>>>
>>> This is because the code branches on non-zero value in the cookie,
>>> else uses ts_key. Please make this explicit. Or perhaps better, add a
>>> bit in the cookie so that the full 32-bit space can be used.
>>
>> Adding a bit in the cookie is not enough, I have to add another flag to
>> inet_cork. And we are running out of space for tx flags,
>> inet_cork::tx_flags is u8 and we have only 1 bit left for SKBTX* enum.
>> Do you think it's OK to use this last bit for OPT_ID feature?
> 
> No, that space is particularly constrained in skb_shinfo.
> 
> Either a separate bit in inet_cork, or just keep as is.

Ok, got it. I'll add IPCORK_TS_OPT_ID flag then. Thanks!

Patch

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 394c3b66065e..7e8545311557 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -174,6 +174,7 @@  struct inet_cork {
 	__s16			tos;
 	char			priority;
 	__u16			gso_size;
+	u32			ts_opt_id;
 	u64			transmit_time;
 	u32			mark;
 };
diff --git a/include/net/sock.h b/include/net/sock.h
index f51d61fab059..73e21dad5660 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1794,6 +1794,7 @@  struct sockcm_cookie {
 	u64 transmit_time;
 	u32 mark;
 	u32 tsflags;
+	u32 ts_opt_id;
 };
 
 static inline void sockcm_init(struct sockcm_cookie *sockc,
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 8ce8a39a1e5f..db3df3e74b01 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -135,6 +135,8 @@ 
 #define SO_PASSPIDFD		76
 #define SO_PEERPIDFD		77
 
+#define SCM_TS_OPT_ID		78
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
diff --git a/net/core/sock.c b/net/core/sock.c
index 468b1239606c..918cb6a0dcba 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2859,6 +2859,20 @@  int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
 			return -EINVAL;
 		sockc->transmit_time = get_unaligned((u64 *)CMSG_DATA(cmsg));
 		break;
+	case SCM_TS_OPT_ID:
+		/* allow this option for UDP sockets only */
+		if (!sk_is_udp(sk))
+			return -EINVAL;
+		tsflags = READ_ONCE(sk->sk_tsflags);
+		if (!(tsflags & SOF_TIMESTAMPING_OPT_ID))
+			return -EINVAL;
+		if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
+			return -EINVAL;
+		sockc->ts_opt_id = get_unaligned((u32 *)CMSG_DATA(cmsg));
+		/* do not allow 0 as packet id for timestamp */
+		if (!sockc->ts_opt_id)
+			return -EINVAL;
+		break;
 	/* SCM_RIGHTS and SCM_CREDENTIALS are semantically in SOL_UNIX. */
 	case SCM_RIGHTS:
 	case SCM_CREDENTIALS:
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index b90d0f78ac80..f1e6695cafd2 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1050,8 +1050,14 @@  static int __ip_append_data(struct sock *sk,
 
 	hold_tskey = cork->tx_flags & SKBTX_ANY_TSTAMP &&
 		     READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_OPT_ID;
-	if (hold_tskey)
-		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
+	if (hold_tskey) {
+		if (cork->ts_opt_id) {
+			hold_tskey = false;
+			tskey = cork->ts_opt_id;
+		} else {
+			tskey = atomic_inc_return(&sk->sk_tskey) - 1;
+		}
+	}
 
 	/* So, what's going on in the loop below?
 	 *
@@ -1324,6 +1330,7 @@  static int ip_setup_cork(struct sock *sk, struct inet_cork *cork,
 	cork->mark = ipc->sockc.mark;
 	cork->priority = ipc->priority;
 	cork->transmit_time = ipc->sockc.transmit_time;
+	cork->ts_opt_id = ipc->sockc.ts_opt_id;
 	cork->tx_flags = 0;
 	sock_tx_timestamp(sk, ipc->sockc.tsflags, &cork->tx_flags);
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f26841f1490f..602064250546 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1401,6 +1401,7 @@  static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 	cork->base.gso_size = ipc6->gso_size;
 	cork->base.tx_flags = 0;
 	cork->base.mark = ipc6->sockc.mark;
+	cork->base.ts_opt_id = ipc6->sockc.ts_opt_id;
 	sock_tx_timestamp(sk, ipc6->sockc.tsflags, &cork->base.tx_flags);
 
 	cork->base.length = 0;
@@ -1545,8 +1546,14 @@  static int __ip6_append_data(struct sock *sk,
 
 	hold_tskey = cork->tx_flags & SKBTX_ANY_TSTAMP &&
 		     READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_OPT_ID;
-	if (hold_tskey)
-		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
+	if (hold_tskey) {
+		if (cork->ts_opt_id) {
+			hold_tskey = false;
+			tskey = cork->ts_opt_id;
+		} else {
+			tskey = atomic_inc_return(&sk->sk_tskey) - 1;
+		}
+	}
 
 	/*
 	 * Let's try using as much space as possible.