From patchwork Wed Jul 27 06:09:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin KaFai Lau
X-Patchwork-Id: 12930097
X-Patchwork-Delegate: bpf@iogearbox.net
From: Martin KaFai Lau
CC: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, David Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Subject: [PATCH bpf-next 10/14] bpf: Change bpf_setsockopt(SOL_TCP) to reuse do_tcp_setsockopt()
Date: Tue, 26 Jul 2022 23:09:59 -0700
Message-ID: <20220727060959.2378252-1-kafai@fb.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20220727060856.2370358-1-kafai@fb.com>
References: <20220727060856.2370358-1-kafai@fb.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

After the prep work in the previous patches, this patch removes all
the duplicated code from bpf_setsockopt(SOL_TCP) and reuses
do_tcp_setsockopt().

The existing optname white-list is refactored into a new function,
sol_tcp_setsockopt().  sol_tcp_setsockopt() also calls
bpf_sol_tcp_setsockopt() to handle the TCP_BPF_XXX specific optnames.

bpf_setsockopt(TCP_SAVE_SYN) now also accepts the value 2, which saves
the eth header as well.  This comes for free from do_tcp_setsockopt().

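For illustration only (not part of this patch): a rough user-space sketch of
the white-list dispatch described above. Every sketch_* / SKETCH_* name is a
hypothetical stand-in, the optname values are arbitrary, and the two backend
functions only record which path was taken instead of touching a socket. It
assumes the TCP_BPF_XXX handler rejects optnames it does not know.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for kernel optnames; values are arbitrary. */
enum sketch_opt {
	SKETCH_TCP_KEEPIDLE = 1,
	SKETCH_TCP_SAVE_SYN,
	SKETCH_TCP_CONGESTION,
	SKETCH_TCP_BPF_IW,	/* stands in for a TCP_BPF_XXX optname */
	SKETCH_TCP_UNLISTED,	/* an optname that is not on the white-list */
};

enum sketch_path { SKETCH_PATH_NONE, SKETCH_PATH_COMMON, SKETCH_PATH_BPF_ONLY };

/* Records which backend handled the most recent call. */
static enum sketch_path sketch_last_path = SKETCH_PATH_NONE;

/* Stand-in for do_tcp_setsockopt(): the shared TCP implementation. */
static int sketch_do_tcp_setsockopt(int optname, const void *optval, int optlen)
{
	(void)optname; (void)optval; (void)optlen;
	sketch_last_path = SKETCH_PATH_COMMON;
	return 0;
}

/* Stand-in for bpf_sol_tcp_setsockopt(): assumed to reject unknown optnames. */
static int sketch_bpf_sol_tcp_setsockopt(int optname, const void *optval, int optlen)
{
	(void)optval; (void)optlen;
	sketch_last_path = SKETCH_PATH_BPF_ONLY;
	return optname == SKETCH_TCP_BPF_IW ? 0 : -EINVAL;
}

/* White-list dispatch: int-sized options are length-checked, then forwarded. */
static int sketch_sol_tcp_setsockopt(int optname, const void *optval, int optlen)
{
	switch (optname) {
	case SKETCH_TCP_KEEPIDLE:
	case SKETCH_TCP_SAVE_SYN:
		if (optlen != (int)sizeof(int))
			return -EINVAL;
		break;
	case SKETCH_TCP_CONGESTION:
		break;	/* string option, no fixed length */
	default:
		return sketch_bpf_sol_tcp_setsockopt(optname, optval, optlen);
	}

	return sketch_do_tcp_setsockopt(optname, optval, optlen);
}

/* Small self-check exercising each branch of the dispatch. */
static void sketch_selfcheck(void)
{
	int val = 2;

	assert(sketch_sol_tcp_setsockopt(SKETCH_TCP_SAVE_SYN, &val, sizeof(val)) == 0);
	assert(sketch_last_path == SKETCH_PATH_COMMON);
	assert(sketch_sol_tcp_setsockopt(SKETCH_TCP_KEEPIDLE, &val, 1) == -EINVAL);
	assert(sketch_sol_tcp_setsockopt(SKETCH_TCP_BPF_IW, &val, sizeof(val)) == 0);
	assert(sketch_last_path == SKETCH_PATH_BPF_ONLY);
}
```

Calling sketch_selfcheck() from a main() exercises all three branches. In
this sketch, range checks on the value (such as allowing 2 for TCP_SAVE_SYN)
are left to the shared backend, which is why the dispatcher itself only
validates the length.
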
Signed-off-by: Martin KaFai Lau
---
 include/net/tcp.h |  2 +
 net/core/filter.c | 97 +++++++++++++++--------------------------------
 net/ipv4/tcp.c    |  2 +-
 3 files changed, 33 insertions(+), 68 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index f9e7c85ea829..06b63a807c33 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -405,6 +405,8 @@ __poll_t tcp_poll(struct file *file, struct socket *sock,
 int tcp_getsockopt(struct sock *sk, int level, int optname,
 		   char __user *optval, int __user *optlen);
 bool tcp_bpf_bypass_getsockopt(int level, int optname);
+int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+		      sockptr_t optval, unsigned int optlen);
 int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
 		   unsigned int optlen);
 void tcp_set_keepalive(struct sock *sk, int val);
diff --git a/net/core/filter.c b/net/core/filter.c
index 8dd195b9b860..97aed6575810 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5095,6 +5095,34 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
 	return 0;
 }
 
+static int sol_tcp_setsockopt(struct sock *sk, int optname,
+			      char *optval, int optlen)
+{
+	if (sk->sk_prot->setsockopt != tcp_setsockopt)
+		return -EINVAL;
+
+	switch (optname) {
+	case TCP_KEEPIDLE:
+	case TCP_KEEPINTVL:
+	case TCP_KEEPCNT:
+	case TCP_SYNCNT:
+	case TCP_WINDOW_CLAMP:
+	case TCP_USER_TIMEOUT:
+	case TCP_NOTSENT_LOWAT:
+	case TCP_SAVE_SYN:
+		if (optlen != sizeof(int))
+			return -EINVAL;
+		break;
+	case TCP_CONGESTION:
+		break;
+	default:
+		return bpf_sol_tcp_setsockopt(sk, optname, optval, optlen);
+	}
+
+	return do_tcp_setsockopt(sk, SOL_TCP, optname,
+				 KERNEL_SOCKPTR_BPF(optval), optlen);
+}
+
 static int __bpf_setsockopt(struct sock *sk, int level, int optname,
 			    char *optval, int optlen)
 {
@@ -5147,73 +5175,8 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
 		default:
 			ret = -EINVAL;
 		}
-	} else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP &&
-		   sk->sk_prot->setsockopt == tcp_setsockopt) {
-		if (optname >= TCP_BPF_IW)
-			return bpf_sol_tcp_setsockopt(sk, optname,
-						      optval, optlen);
-
-		if (optname == TCP_CONGESTION) {
-			char name[TCP_CA_NAME_MAX];
-
-			strncpy(name, optval, min_t(long, optlen,
-						    TCP_CA_NAME_MAX-1));
-			name[TCP_CA_NAME_MAX-1] = 0;
-			ret = tcp_set_congestion_control(sk, name, false, true);
-		} else {
-			struct inet_connection_sock *icsk = inet_csk(sk);
-			struct tcp_sock *tp = tcp_sk(sk);
-
-			if (optlen != sizeof(int))
-				return -EINVAL;
-
-			val = *((int *)optval);
-			/* Only some options are supported */
-			switch (optname) {
-			case TCP_SAVE_SYN:
-				if (val < 0 || val > 1)
-					ret = -EINVAL;
-				else
-					tp->save_syn = val;
-				break;
-			case TCP_KEEPIDLE:
-				ret = tcp_sock_set_keepidle_locked(sk, val);
-				break;
-			case TCP_KEEPINTVL:
-				if (val < 1 || val > MAX_TCP_KEEPINTVL)
-					ret = -EINVAL;
-				else
-					tp->keepalive_intvl = val * HZ;
-				break;
-			case TCP_KEEPCNT:
-				if (val < 1 || val > MAX_TCP_KEEPCNT)
-					ret = -EINVAL;
-				else
-					tp->keepalive_probes = val;
-				break;
-			case TCP_SYNCNT:
-				if (val < 1 || val > MAX_TCP_SYNCNT)
-					ret = -EINVAL;
-				else
-					icsk->icsk_syn_retries = val;
-				break;
-			case TCP_USER_TIMEOUT:
-				if (val < 0)
-					ret = -EINVAL;
-				else
-					icsk->icsk_user_timeout = val;
-				break;
-			case TCP_NOTSENT_LOWAT:
-				tp->notsent_lowat = val;
-				sk->sk_write_space(sk);
-				break;
-			case TCP_WINDOW_CLAMP:
-				ret = tcp_set_window_clamp(sk, val);
-				break;
-			default:
-				ret = -EINVAL;
-			}
-		}
+	} else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP) {
+		return sol_tcp_setsockopt(sk, optname, optval, optlen);
 	} else {
 		ret = -EINVAL;
 	}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7f8d81befa8e..5a327a0e1af9 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3439,7 +3439,7 @@ int tcp_set_window_clamp(struct sock *sk, int val)
 /*
  *	Socket option code for TCP.
  */
-static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+int do_tcp_setsockopt(struct sock *sk, int level, int optname,
 		sockptr_t optval, unsigned int optlen)
 {
 	struct tcp_sock *tp = tcp_sk(sk);