[net-next,v2,0/2] timestamp: control SOF_TIMESTAMPING_RX_SOFTWARE feature per socket

Message ID 20240828160145.68805-1-kerneljasonxing@gmail.com (mailing list archive)

Message

Jason Xing Aug. 28, 2024, 4:01 p.m. UTC
From: Jason Xing <kernelxing@tencent.com>

Prior to this series, when one socket sets SOF_TIMESTAMPING_RX_SOFTWARE,
which means the whole system turns on this switch, other sockets that
only have SOF_TIMESTAMPING_SOFTWARE set are affected and report the rx
timestamp even without the SOF_TIMESTAMPING_RX_SOFTWARE flag. In such a
case, the rxtimestamp.c selftest fails; please see testcase 6.

In the normal case, if we only set the SOF_TIMESTAMPING_SOFTWARE flag, we
can't get the rx timestamp because no path leads to turning on
netstamp_needed_key in net_enable_timestamp(). That is to say, if the
user only sets SOF_TIMESTAMPING_SOFTWARE, we don't expect to be able to
fetch the timestamp from the skb.
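
The interaction described above can be pictured with a small userspace
toy model. This is only a sketch, not kernel code: the toy_* names are
made up for illustration, the global counter merely stands in for
netstamp_needed_key, and the two flag values are copied from
include/uapi/linux/net_tstamp.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from include/uapi/linux/net_tstamp.h so that the sketch
 * builds without kernel headers. */
enum {
	SOF_TIMESTAMPING_RX_SOFTWARE = 1 << 3,	/* generation flag */
	SOF_TIMESTAMPING_SOFTWARE    = 1 << 4,	/* report flag */
};

/* Toy stand-in for the system-wide netstamp_needed_key. */
static int toy_netstamp_needed;

struct toy_sock { unsigned int tsflags; };
struct toy_skb  { int64_t tstamp; };	/* 0 means "not stamped" */

/* Hypothetical setsockopt(): a new generation flag bumps the global key. */
void toy_set_tsflags(struct toy_sock *sk, unsigned int flags)
{
	if ((flags & SOF_TIMESTAMPING_RX_SOFTWARE) &&
	    !(sk->tsflags & SOF_TIMESTAMPING_RX_SOFTWARE))
		toy_netstamp_needed++;
	sk->tsflags = flags;
}

/* Receive path: the skb is stamped before its owning socket is known. */
void toy_rx(struct toy_skb *skb, int64_t now)
{
	skb->tstamp = toy_netstamp_needed > 0 ? now : 0;
}

/* Report check as it behaves before this series: report flag only. */
bool toy_reports_rx_tstamp(const struct toy_sock *sk,
			   const struct toy_skb *skb)
{
	return (sk->tsflags & SOF_TIMESTAMPING_SOFTWARE) && skb->tstamp != 0;
}

/* Walk through the scenario: socket b never asks for generation. */
void toy_demo(void)
{
	struct toy_sock a = { 0 }, b = { 0 };
	struct toy_skb skb = { 0 };

	toy_set_tsflags(&b, SOF_TIMESTAMPING_SOFTWARE);
	toy_rx(&skb, 1000);
	assert(!toy_reports_rx_tstamp(&b, &skb));	/* key is still off */

	toy_set_tsflags(&a, SOF_TIMESTAMPING_SOFTWARE |
			    SOF_TIMESTAMPING_RX_SOFTWARE);
	toy_rx(&skb, 2000);
	assert(toy_reports_rx_tstamp(&b, &skb));	/* b now gets a stamp */
}
```

Calling toy_demo() from a main() shows socket b starting to report rx
timestamps as soon as the unrelated socket a sets the generation flag.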

Beyond this, there are other ways to turn on netstamp_needed_key, which
will incidentally allow users to get the timestamp in the receive path.
Please see net_enable_timestamp().

How to solve it?

The setsockopt interface is used to control each socket separately, but
in this case a socket is affected by other sockets. For the timestamp
itself, it's not feasible to convert netstamp_needed_key into a
per-socket switch, because when the receive stack first handles the skb
from the driver it doesn't yet know which socket the skb belongs to.

According to the original design, we should not use the generation flag
(SOF_TIMESTAMPING_RX_SOFTWARE) and the report flag
(SOF_TIMESTAMPING_SOFTWARE) together to test whether the application is
allowed to receive the timestamp report in the receive path. But that
doesn't hold for the receive timestamping case, so we have to make an
exception.

So we have to test the generation flag when the application calls
recvmsg: if both flags are set, it means we want the timestamp; if not,
it means we don't expect to see the timestamp even if the skb carries
one.

As we can see, this series puts SOF_TIMESTAMPING_RX_SOFTWARE under
setsockopt control, so it is now a fine-grained, per-socket feature.
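
As a rough illustration of the check described above, the two
predicates below compare the report decision before the series with
the one proposed here. It is a userspace sketch under assumptions: the
function names are invented, skb_stamped stands for "the skb carries a
software timestamp", and the flag values are copied from
include/uapi/linux/net_tstamp.h. The real patches change
__sock_recv_timestamp() and the per-protocol callers instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from include/uapi/linux/net_tstamp.h. */
enum {
	SOF_TIMESTAMPING_RX_SOFTWARE = 1 << 3,	/* generation flag */
	SOF_TIMESTAMPING_SOFTWARE    = 1 << 4,	/* report flag */
};

/* Before the series: the report flag alone decides. */
bool rx_sw_report_before(unsigned int tsflags, bool skb_stamped)
{
	return skb_stamped && (tsflags & SOF_TIMESTAMPING_SOFTWARE);
}

/* As proposed in v2: generation and report flag must both be set. */
bool rx_sw_report_v2(unsigned int tsflags, bool skb_stamped)
{
	const unsigned int both = SOF_TIMESTAMPING_SOFTWARE |
				  SOF_TIMESTAMPING_RX_SOFTWARE;

	return skb_stamped && (tsflags & both) == both;
}

void rx_sw_report_demo(void)
{
	unsigned int report_only = SOF_TIMESTAMPING_SOFTWARE;

	/* A stamped skb reaches a socket that set only the report flag. */
	assert(rx_sw_report_before(report_only, true));
	assert(!rx_sw_report_v2(report_only, true));
	assert(rx_sw_report_v2(report_only | SOF_TIMESTAMPING_RX_SOFTWARE,
			       true));
}
```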

v2
Link: https://lore.kernel.org/all/20240825152440.93054-1-kerneljasonxing@gmail.com/
Discussed with Willem
1. update the documentation accordingly
2. add more comments in each patch
3. remove the previous test statements in __sock_recv_timestamp()

Jason Xing (2):
  tcp: make SOF_TIMESTAMPING_RX_SOFTWARE feature per socket
  net: make SOF_TIMESTAMPING_RX_SOFTWARE feature per socket

 Documentation/networking/timestamping.rst |  7 +++++++
 include/net/sock.h                        |  7 ++++---
 net/bluetooth/hci_sock.c                  |  4 ++--
 net/core/sock.c                           |  2 +-
 net/ipv4/ip_sockglue.c                    |  2 +-
 net/ipv4/ping.c                           |  2 +-
 net/ipv4/tcp.c                            | 11 +++++++++--
 net/ipv6/datagram.c                       |  4 ++--
 net/l2tp/l2tp_ip.c                        |  2 +-
 net/l2tp/l2tp_ip6.c                       |  2 +-
 net/nfc/llcp_sock.c                       |  2 +-
 net/rxrpc/recvmsg.c                       |  2 +-
 net/socket.c                              | 11 ++++++++---
 net/unix/af_unix.c                        |  2 +-
 14 files changed, 40 insertions(+), 20 deletions(-)

Comments

Willem de Bruijn Aug. 29, 2024, 2:14 p.m. UTC | #1
Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> Prior to this series, when one socket sets SOF_TIMESTAMPING_RX_SOFTWARE,
> which means the whole system turns on this switch, other sockets that
> only have SOF_TIMESTAMPING_SOFTWARE set are affected and report the rx
> timestamp even without the SOF_TIMESTAMPING_RX_SOFTWARE flag. In such a
> case, the rxtimestamp.c selftest fails; please see testcase 6.
> 
> In the normal case, if we only set the SOF_TIMESTAMPING_SOFTWARE flag, we
> can't get the rx timestamp because no path leads to turning on
> netstamp_needed_key in net_enable_timestamp(). That is to say, if the
> user only sets SOF_TIMESTAMPING_SOFTWARE, we don't expect to be able to
> fetch the timestamp from the skb.

I already happened to stumble upon a counterexample.

The below code requests software timestamps, but does not set the
generate flag. I suspect because they assume a PTP daemon (sfptpd)
running that has already enabled that.

https://github.com/Xilinx-CNS/onload/blob/master/src/tests/onload/hwtimestamping/rx_timestamping.c

I suspect that there will be more of such examples in practice. In
which case we should scuttle this. Please do a search online for
SOF_TIMESTAMPING_SOFTWARE to scan for this pattern.
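
For reference when scanning, the pattern in question looks roughly like
the first helper below, next to the explicit variant. This is a sketch
written for this thread, not code copied from the linked test; the two
function names are made up, and it assumes a Linux build with the uapi
headers installed.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Request software timestamp reports only; no generation flag is set,
 * so rx timestamps show up only if something else enabled generation. */
int request_sw_report_only(int fd)
{
	int flags = SOF_TIMESTAMPING_SOFTWARE;

	assert(fd >= 0);
	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}

/* Request rx software timestamps explicitly: generation plus report. */
int request_sw_rx_explicit(int fd)
{
	int flags = SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_RX_SOFTWARE;

	assert(fd >= 0);
	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}
```

An application written like the first helper keeps working today only
as long as some other socket or subsystem has turned generation on.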
 
Jason Xing Aug. 29, 2024, 3:27 p.m. UTC | #2
On Thu, Aug 29, 2024 at 10:14 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> I already happened to stumble upon a counterexample.
>
> The below code requests software timestamps, but does not set the
> generate flag. I suspect because they assume a PTP daemon (sfptpd)
> running that has already enabled that.

To be honest, I did a quick search through the whole onload program,
and its use of timestamping looks really weird to me.

1. I searched for the SOF_TIMESTAMPING_RX_SOFTWARE flag and found no
other place that actually uses it.
2. Please also see the tx_timestamping.c file [1]. The author similarly
only turns on the SOF_TIMESTAMPING_SOFTWARE report flag without turning
on any of the generation flags we are familiar with, like
SOF_TIMESTAMPING_TX_SOFTWARE, SOF_TIMESTAMPING_TX_SCHED, and
SOF_TIMESTAMPING_TX_ACK.

[1]: https://github.com/Xilinx-CNS/onload/blob/master/src/tests/onload/hwtimestamping/tx_timestamping.c#L247

>
> https://github.com/Xilinx-CNS/onload/blob/master/src/tests/onload/hwtimestamping/rx_timestamping.c
>
> I suspect that there will be more of such examples in practice. In
> which case we should scuttle this. Please do a search online for
> SOF_TIMESTAMPING_SOFTWARE to scan for this pattern.

I feel that only a buggy program, or a program that deliberately takes
advantage of the global netstamp_needed_key, would do this...

Willem de Bruijn Aug. 29, 2024, 4:23 p.m. UTC | #3
Jason Xing wrote:
> I feel that only a buggy program, or a program that deliberately takes
> advantage of the global netstamp_needed_key, would do this...

My point is that I just happen to stumble on one open source example
of this behavior.

That is a strong indication that other applications may make the same
implicit assumption. Both open source, and the probably many more non
public users.

Rule #1 is to not break users.

Given that we even have proof that we would break users, we cannot
make this change, sorry.

A safer alternative is to define a new timestamp option flag that
opt-in enables this filter-if-SOF_TIMESTAMPING_RX_SOFTWARE is not
set behavior.
Jason Xing Aug. 29, 2024, 5:45 p.m. UTC | #4
On Fri, Aug 30, 2024 at 12:23 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> My point is that I just happen to stumble on one open source example
> of this behavior.
>
> That is a strong indication that other applications may make the same
> implicit assumption. Both open source, and the probably many more non
> public users.
>
> Rule #1 is to not break users.

Yes, I know it.

>
> Given that we even have proof that we would break users, we cannot
> make this change, sorry.

Okay. Your concern indeed makes sense. Sigh, I just finished the v3
patch series :S

>
> A safer alternative is to define a new timestamp option flag that
> opt-in enables this filter-if-SOF_TIMESTAMPING_RX_SOFTWARE is not
> set behavior.

At first glance, it sounds a little bit similar to
SOF_TIMESTAMPING_OPT_ID_TCP, which is used to replace
SOF_TIMESTAMPING_OPT_ID in the bytestream case for robustness.

Are you suggesting that if we combine the new report flag with
SOF_TIMESTAMPING_SOFTWARE, the application will not get an rx
timestamp report? The new flag goes the opposite way compared with
SOF_TIMESTAMPING_RX_SOFTWARE, indicating that we don't expect an rx
software report.

If so, what name would you recommend for the new flag, which is a
report flag (not a generation flag)? How about
"SOF_TIMESTAMPING_RX_SOFTWARE_CTRL"? I tried, but my English
vocabulary doesn't help, sorry :(

Thanks,
Jason
Willem de Bruijn Aug. 29, 2024, 6:15 p.m. UTC | #5
Jason Xing wrote:
> On Fri, Aug 30, 2024 at 12:23 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > A safer alternative is to define a new timestamp option flag that
> > opt-in enables this filter-if-SOF_TIMESTAMPING_RX_SOFTWARE is not
> > set behavior.
> 
> At first glance, it sounds a little bit similar to
> SOF_TIMESTAMPING_OPT_ID_TCP, which is used to replace
> SOF_TIMESTAMPING_OPT_ID in the bytestream case for robustness.
>
> Are you suggesting that if we combine the new report flag with
> SOF_TIMESTAMPING_SOFTWARE, the application will not get an rx
> timestamp report? The new flag goes the opposite way compared with
> SOF_TIMESTAMPING_RX_SOFTWARE, indicating that we don't expect an rx
> software report.
>
> If so, what name would you recommend for the new flag, which is a
> report flag (not a generation flag)? How about
> "SOF_TIMESTAMPING_RX_SOFTWARE_CTRL"? I tried, but my English
> vocabulary doesn't help, sorry :(

Something like this?

@@ -947,6 +947,8 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
        memset(&tss, 0, sizeof(tss));
        tsflags = READ_ONCE(sk->sk_tsflags);
        if ((tsflags & SOF_TIMESTAMPING_SOFTWARE) &&
+           (tsflags & SOF_TIMESTAMPING_RX_SOFTWARE ||
+            !(tsflags & SOF_TIMESTAMPING_OPT_RX_SOFTWARE_FILTER))
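
When writing out a test of this shape, the negated mask needs its own
parentheses, because in C the unary ! binds tighter than &. The sketch
below shows the difference in plain userspace C. Assumptions: the
function names are invented, and TOY_OPT_RX_SOFTWARE_FILTER is a
placeholder bit for the proposed opt-in flag, not a uapi value.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	/* Values copied from include/uapi/linux/net_tstamp.h. */
	SOF_TIMESTAMPING_RX_SOFTWARE = 1 << 3,
	SOF_TIMESTAMPING_SOFTWARE    = 1 << 4,
	/* Hypothetical bit for the proposed opt-in flag; not a uapi value. */
	TOY_OPT_RX_SOFTWARE_FILTER   = 1 << 30,
};

/* Intended logic: filter only if the socket opted in and did not set
 * the generation flag itself. */
bool rx_sw_report_allowed(unsigned int tsflags)
{
	return (tsflags & SOF_TIMESTAMPING_SOFTWARE) &&
	       ((tsflags & SOF_TIMESTAMPING_RX_SOFTWARE) ||
		!(tsflags & TOY_OPT_RX_SOFTWARE_FILTER));
}

/* What "!tsflags & FLAG" parses as: (!tsflags) is 0 or 1, so masking it
 * with a high bit always yields 0 and the filter applies to everyone. */
bool rx_sw_report_allowed_unparenthesized(unsigned int tsflags)
{
	return (tsflags & SOF_TIMESTAMPING_SOFTWARE) &&
	       ((tsflags & SOF_TIMESTAMPING_RX_SOFTWARE) ||
		((!tsflags) & TOY_OPT_RX_SOFTWARE_FILTER));
}

void rx_sw_filter_demo(void)
{
	unsigned int report_only = SOF_TIMESTAMPING_SOFTWARE;

	/* Filter not requested: legacy behaviour must be kept. */
	assert(rx_sw_report_allowed(report_only));
	/* The unparenthesized form wrongly filters this socket. */
	assert(!rx_sw_report_allowed_unparenthesized(report_only));
}
```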
Jason Xing Aug. 29, 2024, 6:31 p.m. UTC | #6
On Fri, Aug 30, 2024 at 2:15 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Something like this?
>
> @@ -947,6 +947,8 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
>         memset(&tss, 0, sizeof(tss));
>         tsflags = READ_ONCE(sk->sk_tsflags);
>         if ((tsflags & SOF_TIMESTAMPING_SOFTWARE) &&
> +           (tsflags & SOF_TIMESTAMPING_RX_SOFTWARE ||
> +            !(tsflags & SOF_TIMESTAMPING_OPT_RX_SOFTWARE_FILTER))
>

Yes, at least right now I think so. It can work, I can picture it in my mind.

In this way, we will face three possible situations:
1. Setting SOF_TIMESTAMPING_SOFTWARE only: it behaves as before.
2. Setting SOF_TIMESTAMPING_SOFTWARE|SOF_TIMESTAMPING_RX_SOFTWARE: it
will surely allow users to get the rx timestamp.
3. Setting SOF_TIMESTAMPING_SOFTWARE|new_flag while the skb is
timestamped: the _rx_ timestamp is no longer reported.

The new flag gives users a way to stop the rx timestamp from being
reported.
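
The three situations can be written down as a small table and checked
in userspace. This is only a sketch of my current understanding: the
toy_* names are invented, TOY_NEW_FLAG is a placeholder bit for the
not-yet-named flag, and every case assumes the skb already carries a
software timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
	/* Values copied from include/uapi/linux/net_tstamp.h. */
	SOF_TIMESTAMPING_RX_SOFTWARE = 1 << 3,
	SOF_TIMESTAMPING_SOFTWARE    = 1 << 4,
	/* Placeholder for the new, still unnamed flag; not a uapi value. */
	TOY_NEW_FLAG                 = 1 << 30,
};

/* Would the rx software timestamp be reported to this socket? */
bool toy_rx_sw_reported(unsigned int tsflags, bool skb_stamped)
{
	if (!skb_stamped || !(tsflags & SOF_TIMESTAMPING_SOFTWARE))
		return false;
	return (tsflags & SOF_TIMESTAMPING_RX_SOFTWARE) ||
	       !(tsflags & TOY_NEW_FLAG);
}

struct toy_case {
	const char *name;
	unsigned int tsflags;
	bool expect;		/* expected report with a stamped skb */
};

const struct toy_case toy_cases[] = {
	{ "1. SOFTWARE only",
	  SOF_TIMESTAMPING_SOFTWARE, true },
	{ "2. SOFTWARE|RX_SOFTWARE",
	  SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_RX_SOFTWARE, true },
	{ "3. SOFTWARE|new_flag",
	  SOF_TIMESTAMPING_SOFTWARE | TOY_NEW_FLAG, false },
};

/* Assert that every situation in the table behaves as listed. */
void toy_check_cases(void)
{
	size_t i;

	for (i = 0; i < sizeof(toy_cases) / sizeof(toy_cases[0]); i++)
		assert(toy_rx_sw_reported(toy_cases[i].tsflags, true) ==
		       toy_cases[i].expect);
}
```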

Well, it's too late for me (2:00 AM), sorry :( I need to do more tests
and then get back to you tomorrow.

Thanks for your good suggestion, Willem :) It's really a safer and
better suggestion. I have to sleep...

Jason