diff mbox series

[v2] tcp.7: Add description for TCP_FASTOPEN and TCP_FASTOPEN_CONNECT options

Message ID 20210917041702.167622-1-weiwan@google.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [v2] tcp.7: Add description for TCP_FASTOPEN and TCP_FASTOPEN_CONNECT options

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Wei Wang Sept. 17, 2021, 4:17 a.m. UTC
TCP_FASTOPEN socket option was added by:
commit 8336886f786fdacbc19b719c1f7ea91eb70706d4
TCP_FASTOPEN_CONNECT socket option was added by the following patch
series:
commit 065263f40f0972d5f1cd294bb0242bd5aa5f06b2
commit 25776aa943401662617437841b3d3ea4693ee98a
commit 19f6d3f3c8422d65b5e3d2162e30ef07c6e21ea2
commit 3979ad7e82dfe3fb94a51c3915e64ec64afa45c3
Add detailed description for these 2 options.
Also add descriptions for /proc entry tcp_fastopen and tcp_fastopen_key.

Signed-off-by: Wei Wang <weiwan@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
---
Change in v2: corrected some format issues

 man7/tcp.7 | 110 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)
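
For readers following the tcp_fastopen sysctl part of this patch, here is a
small C sketch that decodes a bitmask using only the bit values listed in the
patch. The names tfo_flag, tfo_flags, and tfo_describe are hypothetical helpers
made up for this illustration; they are not part of the kernel or the man page.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper type: one documented bit and its meaning. */
struct tfo_flag {
    unsigned int bit;
    const char *meaning;
};

/* Bit values and wording taken from the tcp_fastopen text in the patch. */
const struct tfo_flag tfo_flags[] = {
    { 0x1,   "client side Fast Open support enabled" },
    { 0x2,   "server side Fast Open support enabled" },
    { 0x4,   "client may transmit data in SYN without Fast Open option" },
    { 0x200, "server accepts SYN data without Fast Open option" },
    { 0x400, "Fast Open on all listeners without TCP_FASTOPEN" },
};

/*
 * Print the meaning of every documented bit that is set in mask.
 * Returns the number of documented bits that were set.
 */
int tfo_describe(unsigned int mask, FILE *out)
{
    int count = 0;

    for (size_t i = 0; i < sizeof(tfo_flags) / sizeof(tfo_flags[0]); i++) {
        if (mask & tfo_flags[i].bit) {
            fprintf(out, "0x%x: %s\n", tfo_flags[i].bit, tfo_flags[i].meaning);
            count++;
        }
    }
    return count;
}
```

With the default value 0x1 stated in the patch, tfo_describe(0x1, stdout)
reports client side support only; a host that also runs Fast Open listeners
would need 0x2 set as well.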

Comments

Alejandro Colomar Sept. 20, 2021, 7:50 p.m. UTC | #1
Hello Wei,

On 9/17/21 6:17 AM, Wei Wang wrote:
> TCP_FASTOPEN socket option was added by:
> commit 8336886f786fdacbc19b719c1f7ea91eb70706d4
> TCP_FASTOPEN_CONNECT socket option was added by the following patch
> series:
> commit 065263f40f0972d5f1cd294bb0242bd5aa5f06b2
> commit 25776aa943401662617437841b3d3ea4693ee98a
> commit 19f6d3f3c8422d65b5e3d2162e30ef07c6e21ea2
> commit 3979ad7e82dfe3fb94a51c3915e64ec64afa45c3
> Add detailed description for these 2 options.
> Also add descriptions for /proc entry tcp_fastopen and tcp_fastopen_key.
> 
> Signed-off-by: Wei Wang <weiwan@google.com>
> Reviewed-by: Yuchung Cheng <ycheng@google.com>

Thanks for the patch (and the review, Yuchung)!

Please see some comments below.

Cheers,

Alex

> ---
> Change in v2: corrected some format issues
> 
>   man7/tcp.7 | 110 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 110 insertions(+)
> 
> diff --git a/man7/tcp.7 b/man7/tcp.7
> index 0a7c61a37..5a6fa7f50 100644
> --- a/man7/tcp.7
> +++ b/man7/tcp.7
> @@ -423,6 +423,28 @@ option.
>   .\" Since 2.4.0-test7
>   Enable RFC\ 2883 TCP Duplicate SACK support.
>   .TP
> +.IR tcp_fastopen  " (Bitmask; default: 0x1; since Linux 3.7)"
> +Enables RFC\ 7413 Fast Open support.
> +The flag is used as a bitmap with the following values:
> +.RS
> +.IP 0x1
> +Enables client side Fast Open support
> +.IP 0x2
> +Enables server side Fast Open support
> +.IP 0x4
> +Allows client side to transmit data in SYN without Fast Open option
> +.IP 0x200
> +Allows server side to accept SYN data without Fast Open option
> +.IP 0x400
> +Enables Fast Open on all listeners without
> +.B TCP_FASTOPEN
> +socket option
> +.RE
> +.TP
> +.IR tcp_fastopen_key " (since Linux 3.7)"
> +Set server side RFC\ 7413 Fast Open key to generate Fast Open cookie
> +when server side Fast Open support is enabled.
> +.TP
>   .IR tcp_ecn " (Integer; default: see below; since Linux 2.4)"
>   .\" Since 2.4.0-test7
>   Enable RFC\ 3168 Explicit Congestion Notification.
> @@ -1202,6 +1224,94 @@ Bound the size of the advertised window to this value.
>   The kernel imposes a minimum size of SOCK_MIN_RCVBUF/2.
>   This option should not be used in code intended to be
>   portable.
> +.TP
> +.BR TCP_FASTOPEN " (since Linux 3.6)"
> +This option enables Fast Open (RFC\ 7413) on the listener socket.
> +The value specifies the maximum length of pending SYNs
> +(similar to the backlog argument in
> +.BR listen (2)).
> +Once enabled,
> +the listener socket grants the TCP Fast Open cookie on incoming
> +SYN with TCP Fast Open option.
> +.IP
> +More importantly it accepts the data in SYN with a valid Fast Open cookie
> +and responds SYN-ACK acknowledging both the data and the SYN sequence.
> +.BR accept (2)
> +returns a socket that is available for read and write when the handshake
> +has not completed yet.
> +Thus the data exchange can commence before the handshake completes.
> +This option requires enabling the server-side support on sysctl
> +.IR net.ipv4.tcp_fastopen
> +(see above).
> +For TCP Fast Open client-side support,
> +see
> +.BR send (2)
> +.B MSG_FASTOPEN
> +or
> +.B TCP_FASTOPEN_CONNECT
> +below.
> +.TP
> +.BR TCP_FASTOPEN_CONNECT " (since Linux 4.11)"
> +This option enables an alternative way to perform Fast Open on the active
> +side (client).
> +When this option is enabled,
> +.BR connect (2)
> +would behave differently depending if a Fast Open cookie is available for
> +the destination.
> +.IP
> +If a cookie is not available (i.e. first contact to the destination),
> +.BR connect (2)
> +behaves as usual by sending a SYN immediately,
> +except the SYN would include an empty Fast Open cookie option to solicit a
> +cookie.
> +.IP
> +If a cookie is available,
> +.BR connect (2)
> +would return 0 immediately but the SYN transmission is deferred.
> +A subsequent
> +.BR write (2)
> +or
> +.BR sendmsg (2)
> +would trigger a SYN with data plus cookie in the Fast Open option.
> +In other words,
> +the actual connect operation is deferred until data is supplied.
> +.IP
> +.B Note:
> +While this option is designed for convenience,
> +enabling it does change the behaviors and might set new
> +.I errnos

typo?

errno values?

> +of socket calls.

The above is not very clear to me.

> +With cookie present,
> +.BR write (2)
> +/

Does this mean an "or"?  If so, prefer the "or".

> +.BR sendmsg (2)
> +must be called right after
> +.BR connect (2)
> +in order to send out SYN+data to complete 3WHS and establish connection.
> +Calling
> +.BR read (2)
> +right after
> +.BR connect (2)
> +without
> +.BR write (2)
> +will cause the blocking socket to be blocked forever.


> +The application should use either
> +.B TCP_FASTOPEN_CONNECT
> +or
> +.BR send (2)

This is not clear to me.  So TCP_FASTOPEN_CONNECT can use write(2) and 
sendmsg(2) (mentioned above), and TCP_FASTOPEN can only use send(2)?  Or 
what did you mean?

> +with
> +.B MSG_FASTOPEN ,
> +instead of both on the same connection.

 From "The application ...":
Does this have relation with the text just above it?  It appears to me 
to be a more generic statement that both options shouldn't be mixed, so 
maybe a new paragraph is more appropriate.

> +.IP
> +Here is the typical call flow with this new option:
> +  s = socket();
> +  setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN_CONNECT, 1, ...);
> +  connect(s);
> +  write(s); // write() should always follow connect() in order to
> +            // trigger SYN to go out
> +  read(s)/write(s);
> +  ...
> +  close(s);

See man-pages(7):

    Indentation of structure definitions, shell session  logs,  and
        so on
        When  structure  definitions, shell session logs, and so on
        are included in running  text,  indent  them  by  4  spaces
        (i.e.,  a  block  enclosed by .in +4n and .in), format them
        using the .EX and EE macros, and surround them  with  suit‐
        able paragraph markers (either .PP or .IP).  For example:

                .PP
                .in +4n
                .EX
                int
                main(int argc, char *argv[])
                {
                    return 0;
                }
                .EE
                .in
                .PP


>   .SS Sockets API
>   TCP provides limited support for out-of-band data,
>   in the form of (a single byte of) urgent data.
>
Wei Wang Sept. 24, 2021, 10:01 p.m. UTC | #2
On Mon, Sep 20, 2021 at 12:50 PM Alejandro Colomar (man-pages)
<alx.manpages@gmail.com> wrote:
>
> Hello Wei,
>
> On 9/17/21 6:17 AM, Wei Wang wrote:
> > TCP_FASTOPEN socket option was added by:
> > commit 8336886f786fdacbc19b719c1f7ea91eb70706d4
> > TCP_FASTOPEN_CONNECT socket option was added by the following patch
> > series:
> > commit 065263f40f0972d5f1cd294bb0242bd5aa5f06b2
> > commit 25776aa943401662617437841b3d3ea4693ee98a
> > commit 19f6d3f3c8422d65b5e3d2162e30ef07c6e21ea2
> > commit 3979ad7e82dfe3fb94a51c3915e64ec64afa45c3
> > Add detailed description for these 2 options.
> > Also add descriptions for /proc entry tcp_fastopen and tcp_fastopen_key.
> >
> > Signed-off-by: Wei Wang <weiwan@google.com>
> > Reviewed-by: Yuchung Cheng <ycheng@google.com>
>
> Thanks for the patch (and the review, Yuchung)!
>
> Please see some comments below.
>
> Cheers,
>
> Alex
>
> > ---
> > Change in v2: corrected some format issues
> >
> >   man7/tcp.7 | 110 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 110 insertions(+)
> >
> > diff --git a/man7/tcp.7 b/man7/tcp.7
> > index 0a7c61a37..5a6fa7f50 100644
> > --- a/man7/tcp.7
> > +++ b/man7/tcp.7
> > @@ -423,6 +423,28 @@ option.
> >   .\" Since 2.4.0-test7
> >   Enable RFC\ 2883 TCP Duplicate SACK support.
> >   .TP
> > +.IR tcp_fastopen  " (Bitmask; default: 0x1; since Linux 3.7)"
> > +Enables RFC\ 7413 Fast Open support.
> > +The flag is used as a bitmap with the following values:
> > +.RS
> > +.IP 0x1
> > +Enables client side Fast Open support
> > +.IP 0x2
> > +Enables server side Fast Open support
> > +.IP 0x4
> > +Allows client side to transmit data in SYN without Fast Open option
> > +.IP 0x200
> > +Allows server side to accept SYN data without Fast Open option
> > +.IP 0x400
> > +Enables Fast Open on all listeners without
> > +.B TCP_FASTOPEN
> > +socket option
> > +.RE
> > +.TP
> > +.IR tcp_fastopen_key " (since Linux 3.7)"
> > +Set server side RFC\ 7413 Fast Open key to generate Fast Open cookie
> > +when server side Fast Open support is enabled.
> > +.TP
> >   .IR tcp_ecn " (Integer; default: see below; since Linux 2.4)"
> >   .\" Since 2.4.0-test7
> >   Enable RFC\ 3168 Explicit Congestion Notification.
> > @@ -1202,6 +1224,94 @@ Bound the size of the advertised window to this value.
> >   The kernel imposes a minimum size of SOCK_MIN_RCVBUF/2.
> >   This option should not be used in code intended to be
> >   portable.
> > +.TP
> > +.BR TCP_FASTOPEN " (since Linux 3.6)"
> > +This option enables Fast Open (RFC\ 7413) on the listener socket.
> > +The value specifies the maximum length of pending SYNs
> > +(similar to the backlog argument in
> > +.BR listen (2)).
> > +Once enabled,
> > +the listener socket grants the TCP Fast Open cookie on incoming
> > +SYN with TCP Fast Open option.
> > +.IP
> > +More importantly it accepts the data in SYN with a valid Fast Open cookie
> > +and responds SYN-ACK acknowledging both the data and the SYN sequence.
> > +.BR accept (2)
> > +returns a socket that is available for read and write when the handshake
> > +has not completed yet.
> > +Thus the data exchange can commence before the handshake completes.
> > +This option requires enabling the server-side support on sysctl
> > +.IR net.ipv4.tcp_fastopen
> > +(see above).
> > +For TCP Fast Open client-side support,
> > +see
> > +.BR send (2)
> > +.B MSG_FASTOPEN
> > +or
> > +.B TCP_FASTOPEN_CONNECT
> > +below.
> > +.TP
> > +.BR TCP_FASTOPEN_CONNECT " (since Linux 4.11)"
> > +This option enables an alternative way to perform Fast Open on the active
> > +side (client).
> > +When this option is enabled,
> > +.BR connect (2)
> > +would behave differently depending if a Fast Open cookie is available for
> > +the destination.
> > +.IP
> > +If a cookie is not available (i.e. first contact to the destination),
> > +.BR connect (2)
> > +behaves as usual by sending a SYN immediately,
> > +except the SYN would include an empty Fast Open cookie option to solicit a
> > +cookie.
> > +.IP
> > +If a cookie is available,
> > +.BR connect (2)
> > +would return 0 immediately but the SYN transmission is deferred.
> > +A subsequent
> > +.BR write (2)
> > +or
> > +.BR sendmsg (2)
> > +would trigger a SYN with data plus cookie in the Fast Open option.
> > +In other words,
> > +the actual connect operation is deferred until data is supplied.
> > +.IP
> > +.B Note:
> > +While this option is designed for convenience,
> > +enabling it does change the behaviors and might set new
> > +.I errnos
>
> typo?
>
> errno values?
>
> > +of socket calls.
>
> The above is not very clear to me.
>
Will update.

> > +With cookie present,
> > +.BR write (2)
> > +/
>
> Does this mean an "or"?  If so, prefer the "or".
>
Yes. Ack.

> > +.BR sendmsg (2)
> > +must be called right after
> > +.BR connect (2)
> > +in order to send out SYN+data to complete 3WHS and establish connection.
> > +Calling
> > +.BR read (2)
> > +right after
> > +.BR connect (2)
> > +without
> > +.BR write (2)
> > +will cause the blocking socket to be blocked forever.
>
>
> > +The application should use either
> > +.B TCP_FASTOPEN_CONNECT
> > +or
> > +.BR send (2)
>
> This is not clear to me.  So TCP_FASTOPEN_CONNECT can use write(2) and
> sendmsg(2) (mentioned above), and TCP_FASTOPEN can only use send(2)?  Or
> what did you mean?
>

The application should either set the TCP_FASTOPEN_CONNECT socket option
before calling write() or sendmsg(), or call sendto() or sendmsg() with the
MSG_FASTOPEN flag directly (write() takes no flags argument), but not both
on the same connection.
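
To make the two client-side alternatives concrete, here is a minimal C sketch
of each. The function names tfo_send_with_sockopt and tfo_send_with_flag are
hypothetical and exist only for this illustration. The fallback #define values
are the ones I believe the Linux headers use and are only there for older
libc headers. Error handling is reduced to returning -1.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older libc headers (assumed to match the Linux UAPI). */
#ifndef TCP_FASTOPEN_CONNECT
#define TCP_FASTOPEN_CONNECT 30
#endif
#ifndef MSG_FASTOPEN
#define MSG_FASTOPEN 0x20000000
#endif

/*
 * Alternative A: enable TCP_FASTOPEN_CONNECT, then connect() and write().
 * When a cookie is cached, connect() returns at once and the SYN goes out
 * with the data from write(), so write() must follow connect().
 */
ssize_t tfo_send_with_sockopt(int fd, const struct sockaddr *dst,
                              socklen_t dstlen, const void *buf, size_t len)
{
    int on = 1;

    if (setsockopt(fd, IPPROTO_TCP, TCP_FASTOPEN_CONNECT,
                   &on, sizeof(on)) == -1)
        return -1;
    if (connect(fd, dst, dstlen) == -1)
        return -1;
    return write(fd, buf, len);
}

/*
 * Alternative B: no socket option and no connect(); a single sendto()
 * with MSG_FASTOPEN both connects and sends the first data.
 */
ssize_t tfo_send_with_flag(int fd, const struct sockaddr *dst,
                           socklen_t dstlen, const void *buf, size_t len)
{
    return sendto(fd, buf, len, MSG_FASTOPEN, dst, dstlen);
}
```

A caller would pick one of the two functions per connection, which is the
"not both" rule stated above. In both cases fd is assumed to be a fresh,
unconnected TCP socket.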

> > +with
> > +.B MSG_FASTOPEN ,
> > +instead of both on the same connection.
>
>  From "The application ...":
> Does this have relation with the text just above it?  It appears to me
> to be a more generic statement that both options shouldn't be mixed, so
> maybe a new paragraph is more appropriate.
>

I think a new line does make sense, since this is a general statement
for TCP_FASTOPEN_CONNECT option.

> > +.IP
> > +Here is the typical call flow with this new option:
> > +  s = socket();
> > +  setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN_CONNECT, 1, ...);
> > +  connect(s);
> > +  write(s); // write() should always follow connect() in order to
> > +            // trigger SYN to go out
> > +  read(s)/write(s);
> > +  ...
> > +  close(s);
>
> See man-pages(7):
>
>     Indentation of structure definitions, shell session  logs,  and
>         so on
>         When  structure  definitions, shell session logs, and so on
>         are included in running  text,  indent  them  by  4  spaces
>         (i.e.,  a  block  enclosed by .in +4n and .in), format them
>         using the .EX and EE macros, and surround them  with  suit‐
>         able paragraph markers (either .PP or .IP).  For example:
>
>                 .PP
>                 .in +4n
>                 .EX
>                 int
>                 main(int argc, char *argv[])
>                 {
>                     return 0;
>                 }
>                 .EE
>                 .in
>                 .PP
>
>

Ack. Thanks.
Will send out a new version with the above addressed.

> >   .SS Sockets API
> >   TCP provides limited support for out-of-band data,
> >   in the form of (a single byte of) urgent data.
> >
>
>
> --
> Alejandro Colomar
> Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
> http://www.alejandro-colomar.es/

Patch

diff --git a/man7/tcp.7 b/man7/tcp.7
index 0a7c61a37..5a6fa7f50 100644
--- a/man7/tcp.7
+++ b/man7/tcp.7
@@ -423,6 +423,28 @@  option.
 .\" Since 2.4.0-test7
 Enable RFC\ 2883 TCP Duplicate SACK support.
 .TP
+.IR tcp_fastopen  " (Bitmask; default: 0x1; since Linux 3.7)"
+Enables RFC\ 7413 Fast Open support.
+The flag is used as a bitmap with the following values:
+.RS
+.IP 0x1
+Enables client side Fast Open support
+.IP 0x2
+Enables server side Fast Open support
+.IP 0x4
+Allows client side to transmit data in SYN without Fast Open option
+.IP 0x200
+Allows server side to accept SYN data without Fast Open option
+.IP 0x400
+Enables Fast Open on all listeners without
+.B TCP_FASTOPEN
+socket option
+.RE
+.TP
+.IR tcp_fastopen_key " (since Linux 3.7)"
+Set server side RFC\ 7413 Fast Open key to generate Fast Open cookie
+when server side Fast Open support is enabled.
+.TP
 .IR tcp_ecn " (Integer; default: see below; since Linux 2.4)"
 .\" Since 2.4.0-test7
 Enable RFC\ 3168 Explicit Congestion Notification.
@@ -1202,6 +1224,94 @@  Bound the size of the advertised window to this value.
 The kernel imposes a minimum size of SOCK_MIN_RCVBUF/2.
 This option should not be used in code intended to be
 portable.
+.TP
+.BR TCP_FASTOPEN " (since Linux 3.6)"
+This option enables Fast Open (RFC\ 7413) on the listener socket.
+The value specifies the maximum length of pending SYNs
+(similar to the backlog argument in
+.BR listen (2)).
+Once enabled,
+the listener socket grants the TCP Fast Open cookie on incoming
+SYN with TCP Fast Open option.
+.IP
+More importantly it accepts the data in SYN with a valid Fast Open cookie
+and responds SYN-ACK acknowledging both the data and the SYN sequence.
+.BR accept (2)
+returns a socket that is available for read and write when the handshake
+has not completed yet.
+Thus the data exchange can commence before the handshake completes.
+This option requires enabling the server-side support on sysctl
+.IR net.ipv4.tcp_fastopen
+(see above).
+For TCP Fast Open client-side support,
+see
+.BR send (2)
+.B MSG_FASTOPEN
+or
+.B TCP_FASTOPEN_CONNECT
+below.
+.TP
+.BR TCP_FASTOPEN_CONNECT " (since Linux 4.11)"
+This option enables an alternative way to perform Fast Open on the active
+side (client).
+When this option is enabled,
+.BR connect (2)
+would behave differently depending if a Fast Open cookie is available for
+the destination.
+.IP
+If a cookie is not available (i.e. first contact to the destination),
+.BR connect (2)
+behaves as usual by sending a SYN immediately,
+except the SYN would include an empty Fast Open cookie option to solicit a
+cookie.
+.IP
+If a cookie is available,
+.BR connect (2)
+would return 0 immediately but the SYN transmission is deferred.
+A subsequent
+.BR write (2)
+or
+.BR sendmsg (2)
+would trigger a SYN with data plus cookie in the Fast Open option.
+In other words,
+the actual connect operation is deferred until data is supplied.
+.IP
+.B Note:
+While this option is designed for convenience,
+enabling it does change the behaviors and might set new
+.I errnos
+of socket calls.
+With cookie present,
+.BR write (2)
+/
+.BR sendmsg (2)
+must be called right after
+.BR connect (2)
+in order to send out SYN+data to complete 3WHS and establish connection.
+Calling
+.BR read (2)
+right after
+.BR connect (2)
+without
+.BR write (2)
+will cause the blocking socket to be blocked forever.
+The application should use either
+.B TCP_FASTOPEN_CONNECT
+or
+.BR send (2)
+with
+.B MSG_FASTOPEN ,
+instead of both on the same connection.
+.IP
+Here is the typical call flow with this new option:
+  s = socket();
+  setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN_CONNECT, 1, ...);
+  connect(s);
+  write(s); // write() should always follow connect() in order to
+            // trigger SYN to go out
+  read(s)/write(s);
+  ...
+  close(s);
 .SS Sockets API
 TCP provides limited support for out-of-band data,
 in the form of (a single byte of) urgent data.
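
For the server side described in the TCP_FASTOPEN hunk above, a minimal C
sketch of enabling Fast Open on a listener might look like the following.
The function name tfo_listen is hypothetical, the fallback #define is the
value I believe the Linux headers use, and calling setsockopt() after
listen() is a choice of this sketch rather than something the patch states.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Fallback for older libc headers (assumed to match the Linux UAPI). */
#ifndef TCP_FASTOPEN
#define TCP_FASTOPEN 23
#endif

/*
 * Put an already bound TCP socket into the listening state and enable
 * Fast Open on it. qlen is the option value, which the patch describes
 * as the maximum length of pending SYNs.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int tfo_listen(int fd, int backlog, int qlen)
{
    if (listen(fd, backlog) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_FASTOPEN,
                   &qlen, sizeof(qlen)) == -1)
        return -1;
    return 0;
}
```

As the patch text notes, the option only takes effect when server-side
support is enabled in the net.ipv4.tcp_fastopen sysctl. After that, sockets
returned by accept(2) can be read and written as usual.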