mbox series

[v2,00/35] net/tcp: Add TCP-AO support

Message ID 20220923201319.493208-1-dima@arista.com (mailing list archive)
Headers show
Series net/tcp: Add TCP-AO support | expand

Message

Dmitry Safonov Sept. 23, 2022, 8:12 p.m. UTC
Changes from v1:
- Building now with CONFIG_IPV6=n (kernel test robot <lkp@intel.com>)
- Added missing static declarations for local functions
  (kernel test robot <lkp@intel.com>)
- Addressed static analyzer and review comments by Dan Carpenter
  (thanks, they were very useful!)
- Fix elif without defined() for !CONFIG_TCP_AO
- Recursively build selftests/net/tcp_ao (Shuah Khan), patches in:
  https://lore.kernel.org/all/20220919201958.279545-1-dima@arista.com/T/#u
- Don't leak crypto_pool reference when TCP-MD5 key is modified/changed
- Add TCP-AO support for nettest.c and fcnal-test.sh
  (will be used for VRF testing in later versions)

Version 1: https://lore.kernel.org/all/20220818170005.747015-1-dima@arista.com/T/#u

In TODO (expect in next versions):
- selftests on older kernels (or with CONFIG_TCP_AO=n) should exit with
  SKIP, not FAIL
- Support VRFs in setsockopt()
- setsockopt() UAPI padding + a test that structures are of the same
  size on 32-bit as on 64-bit platforms
- IPv4-mapped-IPv6 addresses (might be working, but no selftest now)
- Remove CONFIG_TCP_AO dependency on CONFIG_TCP_MD5SIG
- Add TCP-AO static key
- Measure/benchmark TCP-AO and regular TCP connections
- setsockopt(TCP_REPAIR) with TCP-AO

This patchset implements the TCP-AO option as described in RFC5925. There
is a request from industry to move away from TCP-MD5SIG and it seems the time
is right to have a TCP-AO upstreamed. This TCP option is meant to replace
the TCP MD5 option and address its shortcomings. Specifically, it provides
more secure hashing, key rotation and support for long-lived connections
(see the summary of TCP-AO advantages over TCP-MD5 in (1.3) of RFC5925).
The patch series starts with six patches that are not specific to TCP-AO
but implement a general crypto facility that we thought is useful
to eliminate code duplication between TCP-MD5SIG and TCP-AO as well as other
crypto users. These six patches are being submitted separately in
a different patchset [1]. Including them here will show better the gain
in code sharing. Next are 18 patches that implement the actual TCP-AO option,
followed by patches implementing selftests.

The patch set was written as a collaboration of three authors (in alphabetical
order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine. Additional
credits should be given to Prasad Koya, who was involved in early prototyping
a few years back. There is also a separate submission done by Leonard Crestez
whom we thank for his efforts getting an implementation of RFC5925 submitted
for review upstream [2]. This is an independent implementation that makes
different design decisions.

For example, we chose a similar design to the TCP-MD5SIG implementation and
used setsockopts to program per-socket keys, avoiding the extra complexity
of managing a centralized key database in the kernel. A centralized database
in the kernel has dubious benefits since it doesn’t eliminate per-socket
setsockopts needed to specify which sockets need TCP-AO and what are the
currently preferred keys. It also complicates traffic key caching and
preventing deletion of in-use keys.

In this implementation, a centralized database of keys can be thought of
as living in user space and user applications would have to program those
keys on matching sockets. On the server side, the user application programs
keys (MKTS in TCP-AO nomenclature) on the listening socket for all peers that
are expected to connect. Prefix matching on the peer address is supported.
When a peer issues a successful connect, all the MKTs matching the IP address
of the peer are copied to the newly created socket. On the active side,
when a connect() is issued all MKTs that do not match the peer are deleted
from the socket since they will never match the peer. This implementation
uses three setsockopt()s for adding, deleting and modifying keys on a socket.
All three setsockopt()s have extensive sanity checks that prevent
inconsistencies in the keys on a given socket. A getsockopt() is provided
to get key information from any given socket.

Few things to note about this implementation:
- Traffic keys are cached for established connections avoiding the cost of
  such calculation for each packet received or sent.
- Great care has been taken to avoid deleting in-use MKTs
  as required by the RFC.
- Any crypto algorithm supported by the Linux kernel can be used
  to calculate packet hashes.
- Fastopen works with TCP-AO but hasn’t been tested extensively.
- Tested for interop with other major networking vendors (on linux-4.19),
  including testing for key rotation and long lived connections.

[1]: https://lore.kernel.org/all/20220726201600.1715505-1-dima@arista.com/
[2]: https://lore.kernel.org/all/cover.1658815925.git.cdleonard@gmail.com/

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Bob Gilligan <gilligan@arista.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Francesco Ruggeri <fruggeri@arista.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Ivan Delalande <colona@arista.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Leonard Crestez <cdleonard@gmail.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Salam Noureddine <noureddine@arista.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: netdev@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Dmitry Safonov (35):
  crypto: Introduce crypto_pool
  crypto_pool: Add crypto_pool_reserve_scratch()
  net/tcp: Separate tcp_md5sig_info allocation into
    tcp_md5sig_info_add()
  net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
  net/tcp: Use crypto_pool for TCP-MD5
  net/ipv6: sr: Switch to using crypto_pool
  tcp: Add TCP-AO config and structures
  net/tcp: Introduce TCP_AO setsockopt()s
  net/tcp: Prevent TCP-MD5 with TCP-AO being set
  net/tcp: Calculate TCP-AO traffic keys
  net/tcp: Add TCP-AO sign to outgoing packets
  net/tcp: Add tcp_parse_auth_options()
  net/tcp: Add AO sign to RST packets
  net/tcp: Add TCP-AO sign to twsk
  net/tcp: Wire TCP-AO to request sockets
  net/tcp: Sign SYN-ACK segments with TCP-AO
  net/tcp: Verify inbound TCP-AO signed segments
  net/tcp: Add TCP-AO segments counters
  net/tcp: Add TCP-AO SNE support
  net/tcp: Add tcp_hash_fail() ratelimited logs
  net/tcp: Ignore specific ICMPs for TCP-AO connections
  net/tcp: Add option for TCP-AO to (not) hash header
  net/tcp: Add getsockopt(TCP_AO_GET)
  net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs)
  selftests/net: Add TCP-AO library
  selftests/net: Verify that TCP-AO complies with ignoring ICMPs
  selftest/net: Add TCP-AO ICMPs accept test
  selftest/tcp-ao: Add a test for MKT matching
  selftest/tcp-ao: Add test for TCP-AO add setsockopt() command
  selftests/tcp-ao: Add TCP-AO + TCP-MD5 + no sign listen socket tests
  selftests/aolib: Add test/benchmark for removing MKTs
  selftests/nettest: Remove client_pw
  selftest/nettest: Rename md5_prefix* => auth_prefix*
  selftests/nettest: Add TCP-AO support
  selftests/fcnal-test.sh: Add TCP-AO tests

 crypto/Kconfig                                |   12 +
 crypto/Makefile                               |    1 +
 crypto/crypto_pool.c                          |  326 +++
 include/crypto/pool.h                         |   33 +
 include/linux/sockptr.h                       |   23 +
 include/linux/tcp.h                           |   24 +
 include/net/dropreason.h                      |   25 +
 include/net/seg6_hmac.h                       |    7 -
 include/net/tcp.h                             |  193 +-
 include/net/tcp_ao.h                          |  283 +++
 include/uapi/linux/snmp.h                     |    5 +
 include/uapi/linux/tcp.h                      |   62 +
 net/ipv4/Kconfig                              |   15 +-
 net/ipv4/Makefile                             |    1 +
 net/ipv4/proc.c                               |    5 +
 net/ipv4/tcp.c                                |  191 +-
 net/ipv4/tcp_ao.c                             | 1978 +++++++++++++++++
 net/ipv4/tcp_input.c                          |   94 +-
 net/ipv4/tcp_ipv4.c                           |  390 +++-
 net/ipv4/tcp_minisocks.c                      |   37 +-
 net/ipv4/tcp_output.c                         |  192 +-
 net/ipv6/Kconfig                              |    2 +-
 net/ipv6/Makefile                             |    1 +
 net/ipv6/seg6.c                               |    3 -
 net/ipv6/seg6_hmac.c                          |  204 +-
 net/ipv6/tcp_ao.c                             |  151 ++
 net/ipv6/tcp_ipv6.c                           |  327 ++-
 tools/testing/selftests/Makefile              |    1 +
 tools/testing/selftests/net/fcnal-test.sh     |  471 +++-
 tools/testing/selftests/net/nettest.c         |  217 +-
 tools/testing/selftests/net/tcp_ao/.gitignore |    2 +
 tools/testing/selftests/net/tcp_ao/Makefile   |   50 +
 .../selftests/net/tcp_ao/bench-lookups.c      |  403 ++++
 .../selftests/net/tcp_ao/connect-deny.c       |  217 ++
 tools/testing/selftests/net/tcp_ao/connect.c  |   81 +
 .../selftests/net/tcp_ao/icmps-accept.c       |    1 +
 .../selftests/net/tcp_ao/icmps-discard.c      |  447 ++++
 .../testing/selftests/net/tcp_ao/lib/aolib.h  |  333 +++
 .../selftests/net/tcp_ao/lib/netlink.c        |  341 +++
 tools/testing/selftests/net/tcp_ao/lib/proc.c |  267 +++
 .../testing/selftests/net/tcp_ao/lib/setup.c  |  297 +++
 tools/testing/selftests/net/tcp_ao/lib/sock.c |  294 +++
 .../testing/selftests/net/tcp_ao/lib/utils.c  |   30 +
 .../selftests/net/tcp_ao/setsockopt-closed.c  |  191 ++
 .../selftests/net/tcp_ao/unsigned-md5.c       |  483 ++++
 45 files changed, 8099 insertions(+), 612 deletions(-)
 create mode 100644 crypto/crypto_pool.c
 create mode 100644 include/crypto/pool.h
 create mode 100644 include/net/tcp_ao.h
 create mode 100644 net/ipv4/tcp_ao.c
 create mode 100644 net/ipv6/tcp_ao.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/.gitignore
 create mode 100644 tools/testing/selftests/net/tcp_ao/Makefile
 create mode 100644 tools/testing/selftests/net/tcp_ao/bench-lookups.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/connect-deny.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/connect.c
 create mode 120000 tools/testing/selftests/net/tcp_ao/icmps-accept.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/icmps-discard.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/lib/aolib.h
 create mode 100644 tools/testing/selftests/net/tcp_ao/lib/netlink.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/lib/proc.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/lib/setup.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/lib/sock.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/lib/utils.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/setsockopt-closed.c
 create mode 100644 tools/testing/selftests/net/tcp_ao/unsigned-md5.c


base-commit: bf682942cd26ce9cd5e87f73ae099b383041e782

Comments

Dmitry Safonov Sept. 23, 2022, 9:25 p.m. UTC | #1
On 9/23/22 21:12, Dmitry Safonov wrote:
> Changes from v1:
> - Building now with CONFIG_IPV6=n (kernel test robot <lkp@intel.com>)
> - Added missing static declarations for local functions
>   (kernel test robot <lkp@intel.com>)
> - Addressed static analyzer and review comments by Dan Carpenter
>   (thanks, they were very useful!)
> - Fix elif without defined() for !CONFIG_TCP_AO
> - Recursively build selftests/net/tcp_ao (Shuah Khan), patches in:
>   https://lore.kernel.org/all/20220919201958.279545-1-dima@arista.com/T/#u
> - Don't leak crypto_pool reference when TCP-MD5 key is modified/changed
> - Add TCP-AO support for nettest.c and fcnal-test.sh
>   (will be used for VRF testing in later versions)
> 
> Version 1: https://lore.kernel.org/all/20220818170005.747015-1-dima@arista.com/T/#u

I think it's worth answering the question: why am I continuing sending
patches for TCP-AO support when there's already another proposal? [1]
Pre-history how we end up with the second approach is here: [2]
TLDR; we had a customer and a deadline to deliver, I've given reviews to
Leonard, but in the end it seems to me what we got is worth submitting
as it's better in my view in many aspects.

The biggest differences between two proposals, that I care about
(design-decisions, not implementation details):

1. Per-netns TCP-AO keys vs per-socket TCP-AO keys. The reasons why this
proposal is using per-socket keys (that are added like TCP-MD5 keys with
setsockopt()) are:
- They scale better: you don't have to lookup in netns database for a
key. This is a major thing for Arista: we have to support customers that
want more than 1000 peers with possible multiple keys per-peer. This
scales well when the keys are split by design for each socket on every
established connection.
- TCP-AO doesn't require CAP_NET_ADMIN for usage.
- TCP-AO is not meant to be transparent (like ipsec tunnels) for
applications. The users are BGP applications which already know what
they need.
- Leonard's proposal has weird semantics when setsockopt() on some
socket will change keys on other sockets in that network namespace. It
should have been rather netlink-managed API or something of the kind if
the keys are per-netns.

2. This proposal seeks to do less in kernel space and leave more
decision-making to the userspace. It is another major disagreement with
Leonard's proposal, which seeks to add a key lifetime, key rotation
logic and all other business-logic details into the kernel, while here
those decisions are left for the userspace.
If I understood Leonard correctly, he is placing more things in kernel
to simplify migration for user applications from TCP-MD5 to TCP-AO. I
rather think that would be a job for a shared library if that's needed.
As per my perception (1) was also done to relieve userspace from the job
of removing an outdated key simultaneously from all users in netns,
while in this proposal this job is left for userspace to use available
IPC methods. Essentially, I think TCP-AO in kernel should do only
minimum that can't be done "reasonably" in userspace. By "reasonably" I
mean without moving the TCP-IP stack into userspace.

3. Re-using existing kernel code vs copy'n'pasting it, leaving
refactoring for later. I'm a big fan of reusing existing functions. I
think lesser amount of code in the end reduces the burden of maintenance
as well as simplifies the code (both reading and changing). I can see
Leonard's point of simplifying backports to stable releases that he
ships to customers, but I think any upstream proposal should add less
code and try reusing more.

4. Following RFC5925 closer to text. RFC says that rnext_key from the
peer MUST be respected, as well as that current_key MUST not be removed
on an established connection. In this proposal if the requirements of
RFC can be met, they are followed, rather than broken.

5. Using ahash instead of shash. If there's a hardware accelerator - why
not using it? This proposal uses crypto_ahash through per-CPU pool of
crypto requests (crypto_pool).

6. Hash algorithm UAPI: magic constants vs hash name as char *. This is
a thing I've asked Leonard multiple times and what he refuses to change
in his patches: let the UAPI have `char tcpa_alg_name[64]' and just pass
it to crypto_* layer. There is no need for #define MY_HASHING_ALGO 0x2
and another in-kernel array to convert the magic number to algorithm
string in order to pass it to crypto.
The algorithm names are flexible: we already have customer's request to
use other than RFC5926 required hashing algorithms. And I don't see any
value in this middle-layer. This is already what kernel does, see for
example, include/uapi/linux/xfrm.h, grep for alg_name.

7. Adding traffic keys from the beginning. The proposal would be
incomplete without having traffic keys: they are pre-calculated in this
proposal, so the TCP stack doesn't have to do hashing twice (first for
calculation of the traffic key) for every segment on established
connections. This proposal has them naturally per-socket.

I think those are the biggest differences in the approaches and they are
enough to submit a concurrent proposal. Salam, Francesco, please add if
I've missed any other disagreement or major architectural/design
difference in the proposals.

> In TODO (expect in next versions):
> - selftests on older kernels (or with CONFIG_TCP_AO=n) should exit with
>   SKIP, not FAIL
> - Support VRFs in setsockopt()
> - setsockopt() UAPI padding + a test that structures are of the same
>   size on 32-bit as on 64-bit platforms
> - IPv4-mapped-IPv6 addresses (might be working, but no selftest now)
> - Remove CONFIG_TCP_AO dependency on CONFIG_TCP_MD5SIG
> - Add TCP-AO static key
> - Measure/benchmark TCP-AO and regular TCP connections
> - setsockopt(TCP_REPAIR) with TCP-AO
[..]
[1]:
https://lore.kernel.org/linux-crypto/cover.1662361354.git.cdleonard@gmail.com/
[2]:
https://lore.kernel.org/all/8097c38e-e88e-66ad-74d3-2f4a9e3734f4@arista.com/T/#u

Thanks,
          Dmitry
David Ahern Sept. 27, 2022, 1:57 a.m. UTC | #2
On 9/23/22 2:25 PM, Dmitry Safonov wrote:
> On 9/23/22 21:12, Dmitry Safonov wrote:
>> Changes from v1:
>> - Building now with CONFIG_IPV6=n (kernel test robot <lkp@intel.com>)
>> - Added missing static declarations for local functions
>>   (kernel test robot <lkp@intel.com>)
>> - Addressed static analyzer and review comments by Dan Carpenter
>>   (thanks, they were very useful!)
>> - Fix elif without defined() for !CONFIG_TCP_AO
>> - Recursively build selftests/net/tcp_ao (Shuah Khan), patches in:
>>   https://lore.kernel.org/all/20220919201958.279545-1-dima@arista.com/T/#u
>> - Don't leak crypto_pool reference when TCP-MD5 key is modified/changed
>> - Add TCP-AO support for nettest.c and fcnal-test.sh
>>   (will be used for VRF testing in later versions)

The patchset is large enough, so deferring feature addons like VRF to a
follow on set is fine. That said, it needs to be clear that the UAPI
will support VRF from the onset, so please state how VRF support will be
added.

>>
>> Version 1: https://lore.kernel.org/all/20220818170005.747015-1-dima@arista.com/T/#u
> 
> I think it's worth answering the question: why am I continuing sending
> patches for TCP-AO support when there's already another proposal? [1]
> Pre-history how we end up with the second approach is here: [2]
> TLDR; we had a customer and a deadline to deliver, I've given reviews to
> Leonard, but in the end it seems to me what we got is worth submitting
> as it's better in my view in many aspects.
> 
> The biggest differences between two proposals, that I care about
> (design-decisions, not implementation details):

Thanks for the adding the comparison on the 2 implementations.

> 
> 1. Per-netns TCP-AO keys vs per-socket TCP-AO keys. The reasons why this
> proposal is using per-socket keys (that are added like TCP-MD5 keys with
> setsockopt()) are:
> - They scale better: you don't have to lookup in netns database for a
> key. This is a major thing for Arista: we have to support customers that
> want more than 1000 peers with possible multiple keys per-peer. This
> scales well when the keys are split by design for each socket on every
> established connection.
> - TCP-AO doesn't require CAP_NET_ADMIN for usage.
> - TCP-AO is not meant to be transparent (like ipsec tunnels) for
> applications. The users are BGP applications which already know what
> they need.
> - Leonard's proposal has weird semantics when setsockopt() on some
> socket will change keys on other sockets in that network namespace. It
> should have been rather netlink-managed API or something of the kind if
> the keys are per-netns.
> 
> 2. This proposal seeks to do less in kernel space and leave more
> decision-making to the userspace. It is another major disagreement with

I do agree that complexity should be in userspace.


> Leonard's proposal, which seeks to add a key lifetime, key rotation
> logic and all other business-logic details into the kernel, while here
> those decisions are left for the userspace.
> If I understood Leonard correctly, he is placing more things in kernel
> to simplify migration for user applications from TCP-MD5 to TCP-AO. I
> rather think that would be a job for a shared library if that's needed.
> As per my perception (1) was also done to relieve userspace from the job
> of removing an outdated key simultaneously from all users in netns,
> while in this proposal this job is left for userspace to use available
> IPC methods. Essentially, I think TCP-AO in kernel should do only
> minimum that can't be done "reasonably" in userspace. By "reasonably" I
> mean without moving the TCP-IP stack into userspace.
> 
> 3. Re-using existing kernel code vs copy'n'pasting it, leaving
> refactoring for later. I'm a big fan of reusing existing functions. I
> think lesser amount of code in the end reduces the burden of maintenance
> as well as simplifies the code (both reading and changing). I can see
> Leonard's point of simplifying backports to stable releases that he
> ships to customers, but I think any upstream proposal should add less
> code and try reusing more.
> 
> 4. Following RFC5925 closer to text. RFC says that rnext_key from the
> peer MUST be respected, as well as that current_key MUST not be removed
> on an established connection. In this proposal if the requirements of
> RFC can be met, they are followed, rather than broken.
> 
> 5. Using ahash instead of shash. If there's a hardware accelerator - why
> not using it? This proposal uses crypto_ahash through per-CPU pool of
> crypto requests (crypto_pool).
> 
> 6. Hash algorithm UAPI: magic constants vs hash name as char *. This is
> a thing I've asked Leonard multiple times and what he refuses to change
> in his patches: let the UAPI have `char tcpa_alg_name[64]' and just pass
> it to crypto_* layer. There is no need for #define MY_HASHING_ALGO 0x2
> and another in-kernel array to convert the magic number to algorithm
> string in order to pass it to crypto.
> The algorithm names are flexible: we already have customer's request to
> use other than RFC5926 required hashing algorithms. And I don't see any
> value in this middle-layer. This is already what kernel does, see for
> example, include/uapi/linux/xfrm.h, grep for alg_name.
> 
> 7. Adding traffic keys from the beginning. The proposal would be
> incomplete without having traffic keys: they are pre-calculated in this
> proposal, so the TCP stack doesn't have to do hashing twice (first for
> calculation of the traffic key) for every segment on established
> connections. This proposal has them naturally per-socket.
> 
> I think those are the biggest differences in the approaches and they are
> enough to submit a concurrent proposal. Salam, Francesco, please add if
> I've missed any other disagreement or major architectural/design
> difference in the proposals.
> 
>> In TODO (expect in next versions):
>> - selftests on older kernels (or with CONFIG_TCP_AO=n) should exit with
>>   SKIP, not FAIL
>> - Support VRFs in setsockopt()
>> - setsockopt() UAPI padding + a test that structures are of the same
>>   size on 32-bit as on 64-bit platforms
>> - IPv4-mapped-IPv6 addresses (might be working, but no selftest now)
>> - Remove CONFIG_TCP_AO dependency on CONFIG_TCP_MD5SIG
>> - Add TCP-AO static key
>> - Measure/benchmark TCP-AO and regular TCP connections
>> - setsockopt(TCP_REPAIR) with TCP-AO
> [..]
> [1]:
> https://lore.kernel.org/linux-crypto/cover.1662361354.git.cdleonard@gmail.com/
> [2]:
> https://lore.kernel.org/all/8097c38e-e88e-66ad-74d3-2f4a9e3734f4@arista.com/T/#u
> 
> Thanks,
>           Dmitry