diff mbox series

[01/41] rxrpc: Fix RTT determination to use PING ACKs as a source

Message ID 20231109154004.3317227-2-dhowells@redhat.com (mailing list archive)
State New, archived
Headers show
Series afs: Fix probe handling, server rotation and RO volume callback handling | expand

Commit Message

David Howells Nov. 9, 2023, 3:39 p.m. UTC
Fix RTT determination to use a PING ACK that is in marked as being in
response to a DATA packet as the response from which RTT can be calculated
from in lieu of a REQUESTED ACK.  The server may send the latter instead of
the former.

Fixes: 4700c4d80b7b ("rxrpc: Fix loss of RTT samples due to interposed ACK")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
---
 net/rxrpc/input.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Jeffrey E Altman Nov. 9, 2023, 5:16 p.m. UTC | #1
On 11/9/2023 10:39 AM, David Howells wrote:
> Fix RTT determination to use a PING ACK that is in marked as being in
> response to a DATA packet as the response from which RTT can be calculated
> from in lieu of a REQUESTED ACK.  The server may send the latter instead of
> the former.
>
> Fixes: 4700c4d80b7b ("rxrpc: Fix loss of RTT samples due to interposed ACK")
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Marc Dionne <marc.dionne@auristor.com>
> cc: linux-afs@lists.infradead.org
> ---
>   net/rxrpc/input.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
> index 030d64f282f3..fc0d404f3b91 100644
> --- a/net/rxrpc/input.c
> +++ b/net/rxrpc/input.c
> @@ -806,6 +806,10 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
>   		rxrpc_complete_rtt_probe(call, skb->tstamp, acked_serial, ack_serial,
>   					 rxrpc_rtt_rx_ping_response);
>   		break;
> +	case RXRPC_ACK_PING:
> +		if (acked_serial == 0)
> +			break;
> +		fallthrough;
>   	case RXRPC_ACK_REQUESTED:
>   		rxrpc_complete_rtt_probe(call, skb->tstamp, acked_serial, ack_serial,
>   					 rxrpc_rtt_rx_requested_ack);

David,

I do not believe the ack_reason matters within rxrpc_input_ack(). As 
long as the acked_serial is non-zero,
rxrpc_complete_rtt_probe() can be called to attempt to compute an RTT.   
If there is an exact match for the
acked_serial then an RTT can be computed and if acked_serial is later 
than the pending rtt probe, the probe
can be abandoned with the following caveats.

 1. Receiving an acked_serial that is later than the serial of the
    transmitted probe indicates that a packet
    transmitted after the probe was received first.  Or that reordering
    of the transmitted packets occurred.
    Or that the probe was never received by the peer; or that the peer's
    response to the probe was lost in
    transit.
 2. The serial number namespace is unsigned 32-bit shared across all of
    the call channels of the associated
    rx connection.  As the serial numbers will wrap the use of after()
    within rxrpc_complete_rtt_probe to
    compare their values is questionable.   If serial numbers will be
    compared in this manner then they
    need to be locally tracked and compared as unsigned 64-bit values
    where only the low 32-bits are
    transmitted on the wire and any wire serial number equal to zero is
    ignored.

Jeffrey Altman
David Howells Nov. 9, 2023, 10:06 p.m. UTC | #2
I'm going to drop this patch from this series for now and send it separately.

> I do not believe the ack_reason matters within rxrpc_input_ack(). As long as
> the acked_serial is non-zero,
> rxrpc_complete_rtt_probe() can be called to attempt to compute an RTT.   If
> there is an exact match for the
> acked_serial then an RTT can be computed and if acked_serial is later than the
> pending rtt probe, the probe
> can be abandoned with the following caveats.
> 
> 1. Receiving an acked_serial that is later than the serial of the
>    transmitted probe indicates that a packet
>    transmitted after the probe was received first.  Or that reordering
>    of the transmitted packets occurred.
>    Or that the probe was never received by the peer; or that the peer's
>    response to the probe was lost in
>    transit.
> 2. The serial number namespace is unsigned 32-bit shared across all of
>    the call channels of the associated
>    rx connection.  As the serial numbers will wrap the use of after()
>    within rxrpc_complete_rtt_probe to
>    compare their values is questionable.   If serial numbers will be
>    compared in this manner then they
>    need to be locally tracked and compared as unsigned 64-bit values
>    where only the low 32-bits are
>    transmitted on the wire and any wire serial number equal to zero is
>    ignored.

I do ignore ack.serial == 0 for this purpose.

I'm not sure how expanding it internally to 64-bits actually helps since the
upper 32 bits is not visible to the peer.

David
Jeffrey E Altman Nov. 10, 2023, 2:15 p.m. UTC | #3
On 11/9/2023 5:06 PM, David Howells wrote:
> I do not believe the ack_reason matters within rxrpc_input_ack(). As 
> long as
>> the acked_serial is non-zero,
>> rxrpc_complete_rtt_probe() can be called to attempt to compute an RTT.   If
>> there is an exact match for the
>> acked_serial then an RTT can be computed and if acked_serial is later than the
>> pending rtt probe, the probe
>> can be abandoned with the following caveats.
>>
>> 1. Receiving an acked_serial that is later than the serial of the
>>     transmitted probe indicates that a packet
>>     transmitted after the probe was received first.  Or that reordering
>>     of the transmitted packets occurred.
>>     Or that the probe was never received by the peer; or that the peer's
>>     response to the probe was lost in
>>     transit.
>> 2. The serial number namespace is unsigned 32-bit shared across all of
>>     the call channels of the associated
>>     rx connection.  As the serial numbers will wrap the use of after()
>>     within rxrpc_complete_rtt_probe to
>>     compare their values is questionable.   If serial numbers will be
>>     compared in this manner then they
>>     need to be locally tracked and compared as unsigned 64-bit values
>>     where only the low 32-bits are
>>     transmitted on the wire and any wire serial number equal to zero is
>>     ignored.
> I do ignore ack.serial == 0 for this purpose.

Zero has the special meaning - this ACK is not explicitly in response to 
a received packet.

However, as mentioned, the serial number counter wraps frequently and 
most RxRPC implementations
do not transition from serial 0xffffffff -> 0x00000001 when wrapping.

> I'm not sure how expanding it internally to 64-bits actually helps since the
> upper 32 bits is not visible to the peer.
rxrpc_complete_rtt_probe contains the following logic that relies on 
after() being able to detect
a serial number wrap.

         /* If a later serial is being acked, then mark this slot as
          * being available.
          */

         if (after(acked_serial, orig_serial)) {
             trace_rxrpc_rtt_rx(call, rxrpc_rtt_rx_obsolete, i,
                        orig_serial, acked_serial, 0, 0);
             clear_bit(i + RXRPC_CALL_RTT_PEND_SHIFT, &call->rtt_avail);
             smp_wmb();
             set_bit(i, &call->rtt_avail);
         }

Otherwise, acked_serial = 0x01 will be considered smaller than 
orig_serial = 0xfffffffe and the slot will not be marked available.

I will note that there is a similar problem with rxrpc_seq_t values 
which are u32 on the wire but which will wrap for calls that transmit 
more than approximately 5.5TB of data. Calls of this size are unlikely 
for a cache manager but are common for any service transmitting volume 
dumps.

Jeffrey Altman
Jeffrey E Altman Nov. 10, 2023, 4:12 p.m. UTC | #4
On 11/10/2023 9:15 AM, Jeffrey E Altman wrote:
>
>> I'm not sure how expanding it internally to 64-bits actually helps 
>> since the
>> upper 32 bits is not visible to the peer.
> rxrpc_complete_rtt_probe contains the following logic that relies on 
> after() being able to detect
> a serial number wrap.
>
>         /* If a later serial is being acked, then mark this slot as
>          * being available.
>          */
>
>         if (after(acked_serial, orig_serial)) {
>             trace_rxrpc_rtt_rx(call, rxrpc_rtt_rx_obsolete, i,
>                        orig_serial, acked_serial, 0, 0);
>             clear_bit(i + RXRPC_CALL_RTT_PEND_SHIFT, &call->rtt_avail);
>             smp_wmb();
>             set_bit(i, &call->rtt_avail);
>         }
>
> Otherwise, acked_serial = 0x01 will be considered smaller than 
> orig_serial = 0xfffffffe and the slot will not be marked available.
>
> I will note that there is a similar problem with rxrpc_seq_t values 
> which are u32 on the wire but which will wrap for calls that transmit 
> more than approximately 5.5TB of data. Calls of this size are unlikely 
> for a cache manager but are common for any service transmitting volume 
> dumps. 

I misread the definition of after() which is

     static inline bool after(u32 seq1, u32 seq2)
     {
             return (s32)(seq1 - seq2) > 0;
     }

This is sufficient to detect the serial number and sequence number wrapping.

What it doesn't provide is a method for rxrpc tracing and /proc file to 
report on the total number of packets that have been sent or received on 
a call or connection which might or might not be important depending 
upon the use case.

Jeffrey Altman
David Howells Nov. 10, 2023, 5:25 p.m. UTC | #5
Jeffrey E Altman <jaltman@auristor.com> wrote:

> > I do ignore ack.serial == 0 for this purpose.
> 
> Zero has the special meaning - this ACK is not explicitly in response to a
> received packet.
> 
> However, as mentioned, the serial number counter wraps frequently and most
> RxRPC implementations
> do not transition from serial 0xffffffff -> 0x00000001 when wrapping.

I don't skip zero serial numbers either.  I'm not sure whether it would be
better to do so.

> Otherwise, acked_serial = 0x01 will be considered smaller than orig_serial =
> 0xfffffffe and the slot will not be marked available.

As you mentioned in your follow up email, after() deals with that by casting
to signed, subtracting and then examining the result.

David
Jeffrey E Altman Nov. 10, 2023, 9:52 p.m. UTC | #6
On 11/10/2023 12:25 PM, David Howells wrote:
> Jeffrey E Altman<jaltman@auristor.com>  wrote:
>
>>> I do ignore ack.serial == 0 for this purpose.
>> Zero has the special meaning - this ACK is not explicitly in response to a
>> received packet.
>>
>> However, as mentioned, the serial number counter wraps frequently and most
>> RxRPC implementations
>> do not transition from serial 0xffffffff -> 0x00000001 when wrapping.
> I don't skip zero serial numbers either.  I'm not sure whether it would be
> better to do so.

If a DATA packet is sent with serial number zero and an ACK packet is 
sent in response to it
with the ack.serial field set to the DATA packet serial number (zero), 
then the receiver of the
ACK will be unable to compute an RTT from that DATA packet.   It will 
happen rarely but it
will happen.

Jeffrey
Jeffrey E Altman Nov. 10, 2023, 9:54 p.m. UTC | #7
On 11/10/2023 12:25 PM, David Howells wrote:
> Jeffrey E Altman<jaltman@auristor.com>  wrote:
>
>>> I do ignore ack.serial == 0 for this purpose.
>> Zero has the special meaning - this ACK is not explicitly in response to a
>> received packet.
>>
>> However, as mentioned, the serial number counter wraps frequently and most
>> RxRPC implementations
>> do not transition from serial 0xffffffff -> 0x00000001 when wrapping.
> I don't skip zero serial numbers either.  I'm not sure whether it would be
> better to do so.

If a DATA packet is sent with serial number zero and an ACK packet is 
sent in response to it
with the ack.serial field set to the DATA packet serial number (zero), 
then the receiver of the
ACK will be unable to compute an RTT from that DATA packet.   It will 
happen rarely but it
will happen.

Jeffrey
diff mbox series

Patch

diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index 030d64f282f3..fc0d404f3b91 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -806,6 +806,10 @@  static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
 		rxrpc_complete_rtt_probe(call, skb->tstamp, acked_serial, ack_serial,
 					 rxrpc_rtt_rx_ping_response);
 		break;
+	case RXRPC_ACK_PING:
+		if (acked_serial == 0)
+			break;
+		fallthrough;
 	case RXRPC_ACK_REQUESTED:
 		rxrpc_complete_rtt_probe(call, skb->tstamp, acked_serial, ack_serial,
 					 rxrpc_rtt_rx_requested_ack);