[bpf-next,v2,0/2] prevent bpf_reserve_hdr_opt() from growing skb larger than MTU

Message ID 20240827013736.2845596-1-zijianzhang@bytedance.com (mailing list archive)

Zijian Zhang Aug. 27, 2024, 1:37 a.m. UTC
From: Amery Hung <amery.hung@bytedance.com>

This series prevents sockops users from accidentally causing packet
drops. This can happen when a BPF_SOCK_OPS_HDR_OPT_LEN_CB program
reserves different option lengths across its invocations in the
tcp_sendmsg() path.

The sockops BPF_SOCK_OPS_HDR_OPT_LEN_CB program is first called to
reserve option space in tcp_send_mss(), which returns the MSS used for
TSO. It is then called a second time in __tcp_transmit_skb() to
calculate the actual tcp_option_size, after which skb_push() adds the
total header size.

skb->gso_size is restored from TCP_SKB_CB(skb)->tcp_gso_size, which is
derived from tcp_send_mss(), where HDR_OPT_LEN is first called. If the
option size reserved in that first call is smaller than the option size
actually written in the second, the length of the skb can exceed the
MTU. As a result, ip(6)_fragment() will drop the packet if
skb->ignore_df is not set.
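
The size mismatch can be illustrated with a small userspace model. This
is only a sketch under stated assumptions: the MTU of 1500, the 20-byte
IPv4 and TCP base headers, and the helper names are illustrative and are
not taken from the patches, and all other TCP options are ignored.

```c
#include <assert.h>

/* Assumed sizes for illustration only; not taken from the patches. */
enum { MTU = 1500, IP_HDR = 20, TCP_HDR = 20 };

/* Payload budget per segment, computed when option space is first
 * reserved (the role tcp_send_mss() plays in the description above). */
static int mss_for_reserved(int reserved_opt_len)
{
	assert(reserved_opt_len >= 0);
	return MTU - IP_HDR - TCP_HDR - reserved_opt_len;
}

/* Length of one segment once the headers, including the options
 * actually written at transmit time, are pushed in front of the payload. */
static int segment_len(int payload, int actual_opt_len)
{
	return IP_HDR + TCP_HDR + actual_opt_len + payload;
}

/* Non-zero when a segment sized for `reserved` option bytes ends up
 * carrying `actual` option bytes and no longer fits in the MTU. */
static int exceeds_mtu(int reserved, int actual)
{
	return segment_len(mss_for_reserved(reserved), actual) > MTU;
}
```

With these assumed numbers, reserving 4 bytes in the first call and 12
in the second yields a segment 8 bytes over the MTU, while reserving the
same or a smaller size the second time always fits.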

To prevent this accidental packet drop, we need to make sure that the
second call to the BPF_SOCK_OPS_HDR_OPT_LEN_CB program reserves no more
space than the first call did. Since this cannot be enforced at
verification time, we add a runtime sanity check so that
bpf_reserve_hdr_opt() returns an error instead of causing packet drops
later.
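
A runtime check of this kind might look roughly like the following
userspace model. It is a hypothetical sketch, not the code in patch 1:
struct opt_budget, the helper names, and the choice of -ENOSPC as the
error value are all assumptions made for illustration.

```c
#include <errno.h>

/* Hypothetical bookkeeping; the real patch may track this differently. */
struct opt_budget {
	int limit;    /* bytes reserved by the first callback, -1 if unknown */
	int reserved; /* bytes reserved so far in the current callback */
};

/* Start of one HDR_OPT_LEN callback invocation. */
static void begin_callback(struct opt_budget *b)
{
	b->reserved = 0;
}

/* End of one invocation: the first one fixes the limit for later ones. */
static void end_callback(struct opt_budget *b)
{
	if (b->limit < 0)
		b->limit = b->reserved;
}

/* Model of a reserve call: refuse to grow past what the first
 * invocation reserved, instead of letting the skb outgrow the MTU. */
static int reserve_opt_model(struct opt_budget *b, int len)
{
	if (len <= 0)
		return -EINVAL;
	if (b->limit >= 0 && b->reserved + len > b->limit)
		return -ENOSPC; /* assumed error code, for illustration */
	b->reserved += len;
	return 0;
}
```

In this model, a program that reserves 4 bytes in the first invocation
and asks for 12 in the second gets an error back on the second request,
and the reserved size stays within the budget the MSS was computed for.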

We also add a selftest to verify the sanity check. If users accidentally
reserve a small size, bpf_reserve_hdr_opt() should return an appropriate
error value and no packet should be dropped.

Changelog:
  v1 -> v2:
    - I accidentally missed the eBPF prog file in the previous patch
    submission, sorry for the inconvenience.

Amery Hung (1):
  bpf: tcp: prevent bpf_reserve_hdr_opt() from growing skb larger than
    MTU

Zijian Zhang (1):
  bpf: selftests: reserve smaller tcp header options than the actual
    size

 include/net/tcp.h                             |  8 +++
 net/ipv4/tcp_input.c                          |  8 ---
 net/ipv4/tcp_output.c                         | 13 +++-
 .../bpf/prog_tests/tcp_hdr_options.c          | 51 +++++++++++++
 .../bpf/progs/test_reserve_tcp_hdr_options.c  | 71 +++++++++++++++++++
 5 files changed, 141 insertions(+), 10 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_reserve_tcp_hdr_options.c