From patchwork Tue Apr 18 15:31:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13215820 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB6CDC77B7E for ; Tue, 18 Apr 2023 15:32:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229849AbjDRPcE (ORCPT ); Tue, 18 Apr 2023 11:32:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47652 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230489AbjDRPcD (ORCPT ); Tue, 18 Apr 2023 11:32:03 -0400 Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27477F9 for ; Tue, 18 Apr 2023 08:32:00 -0700 (PDT) Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-1a667067275so17993085ad.1 for ; Tue, 18 Apr 2023 08:32:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1681831919; x=1684423919; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=dsYgtyatiuEOqLnBne2bDMKGwMEmj6cAm3wGjunMDm4=; b=IgcKbO0F3bzCFJtPpStc9tqZpqIvbbNQSzJu5wwgsrO7Re1ijcMiILnmNWbpijDidn cyG9hum43x62GWAdZ2b4dsvQPa9lVQj5KVwgiVn9iNw4+hUwns9Z0zbVpmWoOp4Bcmsh w49pIhs9AVBxs1+BsT0C5sdvf0Ch+Njk78pPdPj2vmxTQEAt7vwSE/K3pmZiHqQGqp7W kyQjGBNGkJ6UuHQVvmmX5HEMFqi3OboapPxo37I6tBkuotCGQicGgYGOAyp/slDq1p0E 4AD3oX5wcoGK0tARhw5eZet4Y9sKDbulb7pzQPqTKJrw8NzXrxbwW87AKDuGvhooj9+7 3gtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681831919; x=1684423919; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dsYgtyatiuEOqLnBne2bDMKGwMEmj6cAm3wGjunMDm4=; b=fYdpG+oCLkByS3nc05kHck+MfY82gOTxbuk0tybtrTyERnVmLGV5J+naCT8AOL8NtN +xygKcg1F2A6e6T4AWMhdDn0fWwNsUU8zgNyh3SMufpH2ECTIdTcWfxNYAo0TbeMvMsp L2esYCav9c46iVnjXR7KTPpz0njJqyFYyIx8W2PDl7pvpzsVmX4Pn7WVfi0WqJoxTMu7 pyBG5S8tSe3aC4D0P6UAXtcfP3YEPVYJaJFjr8oB2J0Lr0WTpuKeBVoxLPQK+7cNgUAa hALglXLAVQep9vjQWsYifXSZYLia4YRYvPWWYGw4ScL/fW3DS0EdszmYAvkDjyAMrRwk GsXg== X-Gm-Message-State: AAQBX9esaten1iWBwVX03rni67HSAdmAgLg6hGd7x2vNw2qA89GqQBIP lOqIt7aDWfV2CSkOpZO6p3Y4u2tTKEDlIvx96iM= X-Google-Smtp-Source: AKy350bVeua70D1NdAuCZgIaJ0zaitXDh5GElqnwFysk7bEpkUfgMreUjAvPAstAbTXPzMEae0Rfhw== X-Received: by 2002:a17:903:1cf:b0:1a0:50bd:31bf with SMTP id e15-20020a17090301cf00b001a050bd31bfmr2766911plh.32.1681831919324; Tue, 18 Apr 2023 08:31:59 -0700 (PDT) Received: from localhost.localdomain ([2604:1380:4611:8100::1]) by smtp.gmail.com with ESMTPSA id ba4-20020a170902720400b001a647709864sm9769630plb.155.2023.04.18.08.31.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 18 Apr 2023 08:31:58 -0700 (PDT) From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, edumazet@google.com, aditi.ghag@isovalent.com Subject: [PATCH 1/7] bpf: tcp: Avoid taking fast sock lock in iterator Date: Tue, 18 Apr 2023 15:31:42 +0000 Message-Id: <20230418153148.2231644-2-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230418153148.2231644-1-aditi.ghag@isovalent.com> References: <20230418153148.2231644-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org Previously, BPF TCP iterator was acquiring fast version of sock lock that disables the BH. This introduced a circular dependency with code paths that later acquire sockets hash table bucket lock. 
Replace the fast version of the sock lock with the slow version, which allows BPF programs executed from the iterator to destroy TCP listening sockets using the bpf_sock_destroy kfunc (implemented in follow-up commits). Here is a stack trace that motivated this change: ``` 1) sock_lock with BH disabled + bucket lock lock_acquire+0xcd/0x330 _raw_spin_lock_bh+0x38/0x50 inet_unhash+0x96/0xd0 tcp_set_state+0x6a/0x210 tcp_abort+0x12b/0x230 bpf_prog_f4110fb1100e26b5_iter_tcp6_server+0xa3/0xaa bpf_iter_run_prog+0x1ff/0x340 bpf_iter_tcp_seq_show+0xca/0x190 bpf_seq_read+0x177/0x450 vfs_read+0xc6/0x300 ksys_read+0x69/0xf0 do_syscall_64+0x3c/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc 2) sock lock with BH enabled [ 1.499968] lock_acquire+0xcd/0x330 [ 1.500316] _raw_spin_lock+0x33/0x40 [ 1.500670] sk_clone_lock+0x146/0x520 [ 1.501030] inet_csk_clone_lock+0x1b/0x110 [ 1.501433] tcp_create_openreq_child+0x22/0x3f0 [ 1.501873] tcp_v6_syn_recv_sock+0x96/0x940 [ 1.502284] tcp_check_req+0x137/0x660 [ 1.502646] tcp_v6_rcv+0xa63/0xe80 [ 1.502994] ip6_protocol_deliver_rcu+0x78/0x590 [ 1.503434] ip6_input_finish+0x72/0x140 [ 1.503818] __netif_receive_skb_one_core+0x63/0xa0 [ 1.504281] process_backlog+0x79/0x260 [ 1.504668] __napi_poll.constprop.0+0x27/0x170 [ 1.505104] net_rx_action+0x14a/0x2a0 [ 1.505469] __do_softirq+0x165/0x510 [ 1.505842] do_softirq+0xcd/0x100 [ 1.506172] __local_bh_enable_ip+0xcc/0xf0 [ 1.506588] ip6_finish_output2+0x2a8/0xb00 [ 1.506988] ip6_finish_output+0x274/0x510 [ 1.507377] ip6_xmit+0x319/0x9b0 [ 1.507726] inet6_csk_xmit+0x12b/0x2b0 [ 1.508096] __tcp_transmit_skb+0x549/0xc40 [ 1.508498] tcp_rcv_state_process+0x362/0x1180 ``` Acked-by: Stanislav Fomichev Signed-off-by: Aditi Ghag --- net/ipv4/tcp_ipv4.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index ea370afa70ed..f2d370a9450f 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2962,7 +2962,6 @@ static int bpf_iter_tcp_seq_show(struct seq_file 
*seq, void *v) struct bpf_iter_meta meta; struct bpf_prog *prog; struct sock *sk = v; - bool slow; uid_t uid; int ret; @@ -2970,7 +2969,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v) return 0; if (sk_fullsock(sk)) - slow = lock_sock_fast(sk); + lock_sock(sk); if (unlikely(sk_unhashed(sk))) { ret = SEQ_SKIP; @@ -2994,7 +2993,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v) unlock: if (sk_fullsock(sk)) - unlock_sock_fast(sk, slow); + release_sock(sk); return ret; } From patchwork Tue Apr 18 15:31:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13215822 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F2F3C6FD18 for ; Tue, 18 Apr 2023 15:32:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230489AbjDRPcF (ORCPT ); Tue, 18 Apr 2023 11:32:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47654 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229978AbjDRPcD (ORCPT ); Tue, 18 Apr 2023 11:32:03 -0400 Received: from mail-pl1-x62f.google.com (mail-pl1-x62f.google.com [IPv6:2607:f8b0:4864:20::62f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 292ECE60 for ; Tue, 18 Apr 2023 08:32:01 -0700 (PDT) Received: by mail-pl1-x62f.google.com with SMTP id d9443c01a7336-1a6670671e3so19297075ad.0 for ; Tue, 18 Apr 2023 08:32:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1681831920; x=1684423920; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; 
bh=MUqvsQSwIrgdlIXUPleFIEFJtWEpRJgT9Zrnu0X6Er0=; b=RW1DyetObtW4pDXVWkzzaDts3DFVkqsbpWXAcSLxG39TGlQ7DRZTYzwJzIwAn26wU0 6uOqZm8maQ1FmRgcGKvRYvFoiCpVxp5kblso9zOi4yvcbxDAVvlQlU83ZIdtrEaDxR6K fTm6nHpiCcR068rhM5DGvkif1E4ywlJVfSB4Ri6vDCdFs33sXNF2VO7KwQsYIEiTigK1 Xlo+UqNzWwEe9L0r5a67NhTSVlPMnxWyS0CyLTmkown1EAII/pZq0yHo3+wZ/DTL5J8A pDwpRymyldgA3yyka2bKobLoGzW4QbCyYf6vG0r3lVCW8rhwFO9hd1xbi/CIXrwO5PgW oppA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681831920; x=1684423920; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MUqvsQSwIrgdlIXUPleFIEFJtWEpRJgT9Zrnu0X6Er0=; b=kMXBgFBWuOt+S1iBT4pOOEsgj1MS1DLdSu0RSeajB1FWVhAHpqBX4i8MoDyTnzQMrs 0/8jUGPOHEG7ppSiyY2gWSjaDdZoRqlV883hWjv+o+tg/hx3iHGTct2+opEuPbXpMDdO yvS+w3aQmBPxxb+dTFteXSdxckyg2VSc2RJj5Dtz7w2omoRKow1KlxS11wggPc3r51iB Icr5Vfm8BXLsoA9GtJ1BZWLZiYdCPU+Xq6F48eXrLTqf13L7KSEMlEO3eu6S4q0FskO/ hu3ohhsa3oSM5ZJmNpl6QGWh1rJeGLEbha0CBwkl5m/iW8FJZCMV3wiiCjrvyJhqdyv4 yNyw== X-Gm-Message-State: AAQBX9cTDz6s4Qv95xVlC418YE4+RBdXQ3+AtheVdvoNY3C13cw/VNJT LUVbBqC33RgN5wzAkVFPmjPetHJfzcg0GQZ5m/E= X-Google-Smtp-Source: AKy350a1EHyblUB/s1bHM9EFvt+fHwI9hmiNr2tpXYAq53jDjYu7LBWUcT4ITH+qAmF5wDUEo4CI1A== X-Received: by 2002:a17:902:d2c6:b0:1a6:45f7:b332 with SMTP id n6-20020a170902d2c600b001a645f7b332mr2786421plc.63.1681831920301; Tue, 18 Apr 2023 08:32:00 -0700 (PDT) Received: from localhost.localdomain ([2604:1380:4611:8100::1]) by smtp.gmail.com with ESMTPSA id ba4-20020a170902720400b001a647709864sm9769630plb.155.2023.04.18.08.31.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 18 Apr 2023 08:31:59 -0700 (PDT) From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, edumazet@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau Subject: [PATCH 2/7] udp: seq_file: Remove bpf_seq_afinfo from udp_iter_state Date: Tue, 18 Apr 2023 15:31:43 +0000 
Message-Id: <20230418153148.2231644-3-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230418153148.2231644-1-aditi.ghag@isovalent.com> References: <20230418153148.2231644-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org This is a preparatory commit to remove the bpf_seq_afinfo field from udp_iter_state. The field was previously shared between proc fs and BPF UDP socket iterators. As the follow-up commits will decouple the implementation for the iterators, remove the field. For the BPF socket iterator, filtering of sockets is expected to be done in BPF programs. Suggested-by: Martin KaFai Lau Signed-off-by: Aditi Ghag --- include/net/udp.h | 1 - net/ipv4/udp.c | 34 ++++------------------------------ 2 files changed, 4 insertions(+), 31 deletions(-) diff --git a/include/net/udp.h b/include/net/udp.h index de4b528522bb..5cad44318d71 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -437,7 +437,6 @@ struct udp_seq_afinfo { struct udp_iter_state { struct seq_net_private p; int bucket; - struct udp_seq_afinfo *bpf_seq_afinfo; }; void *udp_seq_start(struct seq_file *seq, loff_t *pos); diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index c605d171eb2d..3c9eeee28678 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2997,10 +2997,7 @@ static struct sock *udp_get_first(struct seq_file *seq, int start) struct udp_table *udptable; struct sock *sk; - if (state->bpf_seq_afinfo) - afinfo = state->bpf_seq_afinfo; - else - afinfo = pde_data(file_inode(seq->file)); + afinfo = pde_data(file_inode(seq->file)); udptable = udp_get_table_afinfo(afinfo, net); @@ -3033,10 +3030,7 @@ static struct sock *udp_get_next(struct seq_file *seq, struct sock *sk) struct udp_seq_afinfo *afinfo; struct udp_table *udptable; - if (state->bpf_seq_afinfo) - afinfo = state->bpf_seq_afinfo; - else - afinfo = pde_data(file_inode(seq->file)); + afinfo = pde_data(file_inode(seq->file)); do { sk = sk_next(sk); @@ 
-3094,10 +3088,7 @@ void udp_seq_stop(struct seq_file *seq, void *v) struct udp_seq_afinfo *afinfo; struct udp_table *udptable; - if (state->bpf_seq_afinfo) - afinfo = state->bpf_seq_afinfo; - else - afinfo = pde_data(file_inode(seq->file)); + afinfo = pde_data(file_inode(seq->file)); udptable = udp_get_table_afinfo(afinfo, seq_file_net(seq)); @@ -3415,28 +3406,11 @@ DEFINE_BPF_ITER_FUNC(udp, struct bpf_iter_meta *meta, static int bpf_iter_init_udp(void *priv_data, struct bpf_iter_aux_info *aux) { - struct udp_iter_state *st = priv_data; - struct udp_seq_afinfo *afinfo; - int ret; - - afinfo = kmalloc(sizeof(*afinfo), GFP_USER | __GFP_NOWARN); - if (!afinfo) - return -ENOMEM; - - afinfo->family = AF_UNSPEC; - afinfo->udp_table = NULL; - st->bpf_seq_afinfo = afinfo; - ret = bpf_iter_init_seq_net(priv_data, aux); - if (ret) - kfree(afinfo); - return ret; + return bpf_iter_init_seq_net(priv_data, aux); } static void bpf_iter_fini_udp(void *priv_data) { - struct udp_iter_state *st = priv_data; - - kfree(st->bpf_seq_afinfo); bpf_iter_fini_seq_net(priv_data); } From patchwork Tue Apr 18 15:31:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13215823 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9831C77B75 for ; Tue, 18 Apr 2023 15:32:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229978AbjDRPcG (ORCPT ); Tue, 18 Apr 2023 11:32:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47656 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231265AbjDRPcD (ORCPT ); Tue, 18 Apr 2023 11:32:03 -0400 Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com 
[IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3BBCE6A for ; Tue, 18 Apr 2023 08:32:01 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id n17so12006651pln.8 for ; Tue, 18 Apr 2023 08:32:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1681831921; x=1684423921; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=H5xBGSQzsY3CbPFJoYnSBy33rsY0QyNtxY6oZiS3oGM=; b=XUKiSs/nbhQFXc+Ow29S7Ep76DFld8+Vk/c23lbZu9P4ThWX9ILWOmScx/fSNKO5Km ZIT2vwsa/lWKg2b9/MbZ4aZwg11XFQKdBUiwBaxtov8xEPNn4hV7MgaJ9lgGf3acxiff bXbPMbrdnELOOT33ltcFaRBbXeTkG5cVEuMffKuW6hv12+3UN1+KfUj1hwJVI152nxmJ wOqJSX9PaLcbsyXw2uyFZwpunon/HhqjXhFQPogHRyTeQQvS/fUk37Mqr+FV1UPB3BZ4 PAAhUOmcJEfuiDps6zZ0R5CA5iV5raQNJmPAvsAjJsd9R6jeJ80ZW4t5W8bBF2nwSGUW WPhQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681831921; x=1684423921; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=H5xBGSQzsY3CbPFJoYnSBy33rsY0QyNtxY6oZiS3oGM=; b=l/6r8is1NTORWVcyci4qSjA5pNsV5GP+5qMUwuAO0V8XPCCV3One4taYO0t2ckBAGY JSZYSq5oKjqP7qTC8+k0oQiXfmepqLu4AYShlnrkjd7EPPtmIDXb/OOkvh4M7xpekYdj XKFBuT+qGTwp+Xo7uWWeAzPJ0H06brg9B2WdQ/nwLLsKEqSJsHIEZO+HAFbn4k1wfsWV JO9DAH95por315Z10XjqgfKMjm2OOpeK+mFKr+idZvt4hxUcoLygNbZ2OV41i9J3g+dn TZWoA7Yps5/gsWXOnFNI8Zl+bH9vGPISiTZzJWhdb+puu77WkeMIlRsrUTjZvsHuM4iR Lgbg== X-Gm-Message-State: AAQBX9ce53m5WjT2ya48q4gIH619BOusnkcQvcNJGE3AaSzjBEeYBCwI 0wllin07NQW7d2NRSHDij2IzU2d0wD/5Idi9ALM= X-Google-Smtp-Source: AKy350ZSDaYg465mQO56H+S5nmpTdJoIw5JTNi0BQwRT65oh69WYH4PU2Qa5DpKblv689RNexOh1yw== X-Received: by 2002:a05:6a20:8412:b0:ee:58da:4e4c with SMTP id c18-20020a056a20841200b000ee58da4e4cmr325093pzd.1.1681831921109; Tue, 18 Apr 2023 08:32:01 -0700 (PDT) Received: from 
localhost.localdomain ([2604:1380:4611:8100::1]) by smtp.gmail.com with ESMTPSA id ba4-20020a170902720400b001a647709864sm9769630plb.155.2023.04.18.08.32.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 18 Apr 2023 08:32:00 -0700 (PDT) From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, edumazet@google.com, aditi.ghag@isovalent.com Subject: [PATCH 3/7] udp: seq_file: Helper function to match socket attributes Date: Tue, 18 Apr 2023 15:31:44 +0000 Message-Id: <20230418153148.2231644-4-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230418153148.2231644-1-aditi.ghag@isovalent.com> References: <20230418153148.2231644-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org This is a preparatory commit to refactor code that matches socket attributes in iterators to a helper function, and use it in the proc fs iterator. Signed-off-by: Aditi Ghag --- net/ipv4/udp.c | 34 +++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 3c9eeee28678..8689ed171776 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2983,6 +2983,16 @@ EXPORT_SYMBOL(udp_prot); /* ------------------------------------------------------------------------ */ #ifdef CONFIG_PROC_FS +static unsigned short seq_file_family(const struct seq_file *seq); +static bool seq_sk_match(struct seq_file *seq, const struct sock *sk) +{ + unsigned short family = seq_file_family(seq); + + /* AF_UNSPEC is used as a match all */ + return ((family == AF_UNSPEC || family == sk->sk_family) && + net_eq(sock_net(sk), seq_file_net(seq))); +} + static struct udp_table *udp_get_table_afinfo(struct udp_seq_afinfo *afinfo, struct net *net) { @@ -3010,10 +3020,7 @@ static struct sock *udp_get_first(struct seq_file *seq, int start) spin_lock_bh(&hslot->lock); sk_for_each(sk, &hslot->head) { - if 
(!net_eq(sock_net(sk), net)) - continue; - if (afinfo->family == AF_UNSPEC || - sk->sk_family == afinfo->family) + if (seq_sk_match(seq, sk)) goto found; } spin_unlock_bh(&hslot->lock); @@ -3034,9 +3041,7 @@ static struct sock *udp_get_next(struct seq_file *seq, struct sock *sk) do { sk = sk_next(sk); - } while (sk && (!net_eq(sock_net(sk), net) || - (afinfo->family != AF_UNSPEC && - sk->sk_family != afinfo->family))); + } while (sk && !seq_sk_match(seq, sk)); if (!sk) { udptable = udp_get_table_afinfo(afinfo, net); @@ -3196,6 +3201,21 @@ static const struct seq_operations bpf_iter_udp_seq_ops = { }; #endif +static unsigned short seq_file_family(const struct seq_file *seq) +{ + const struct udp_seq_afinfo *afinfo; + +#ifdef CONFIG_BPF_SYSCALL + /* BPF iterator: bpf programs to filter sockets. */ + if (seq->op == &bpf_iter_udp_seq_ops) + return AF_UNSPEC; +#endif + + /* Proc fs iterator */ + afinfo = pde_data(file_inode(seq->file)); + return afinfo->family; +} + const struct seq_operations udp_seq_ops = { .start = udp_seq_start, .next = udp_seq_next, From patchwork Tue Apr 18 15:31:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13215824 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 85D04C77B7E for ; Tue, 18 Apr 2023 15:32:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232295AbjDRPcG (ORCPT ); Tue, 18 Apr 2023 11:32:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232062AbjDRPcE (ORCPT ); Tue, 18 Apr 2023 11:32:04 -0400 Received: from mail-pj1-x102c.google.com (mail-pj1-x102c.google.com 
[IPv6:2607:f8b0:4864:20::102c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D2C6AE79 for ; Tue, 18 Apr 2023 08:32:02 -0700 (PDT) Received: by mail-pj1-x102c.google.com with SMTP id 98e67ed59e1d1-2470e93ea71so1476920a91.0 for ; Tue, 18 Apr 2023 08:32:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1681831922; x=1684423922; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=efU6hnJJgSZ1G+H3G+oVBU3mRSmbJcHbOU8ItivXOpU=; b=eGDyrROKHd87UsvY8ukv5fl861gHaAYH12ZzJPOmq2wiJd69FW0tW+sgn9j+btyakY JCZSKGZbVNKZU4VJyQmLEVft8WLRauVY4zjIGR0Owcw30Sg645J6VpQOA8joLNm4EI2i pVb27raOVse8DvZwfHVvY7h3fjmTeG58zGcY1UU2VFvJj5y2V6quJD5zkp8uzb5ROwaT pTuq0Nf8+wFCPDuWK5A1iK7r+NCFkowSmEaeXPQRUIDVGKJGWJZ1gBviuE0IAx0+WTVD KkXZSe/44/f4F3zlnikS+tI6FL6vde21dUEINu3CDRjjE3VhMKL5BXPIG/oeeS1AHuxM d9Ag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681831922; x=1684423922; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=efU6hnJJgSZ1G+H3G+oVBU3mRSmbJcHbOU8ItivXOpU=; b=ei5H1K7NLz+UeyKkf+K7I6XEiJn54gxd1vMDbrjwyxkFe9JxBEj2ZnU0U7KJtOsAxI /DJh4Vj6rHbwKbcFCfw/bORIAn8f/GE5EcrEUJ9x36O34zjAKnWuKkcNZ1zP7zOpVhNv cVMPOyLEvu7ixAX3FgkU7l6nUXhcvY5ThngSguTwLBxbVKDW3qvykOvZM7pPNVQnPyxV i1WWHUiIsYaoEMfYiLmVLsYQZmDBR9pYB5YTn7lx7Eq6i89W+vYmwPYtSmJ2OSaH1HQf 5KtTcFR7qheUUqvarFdmrJTuNo7t2X+Fqi+YgoBRJM29kdU3VjuA2wuAQPEs1A72pp+Q 0Ztw== X-Gm-Message-State: AAQBX9dyEPLxlFmhBFsVxOZQvdJoYoUlFlQnAZ69jDLP+D59nlRYR4FD FDBwM4L8XOqxypeifQjftzT+sDhxjC1OjW3nZ5U= X-Google-Smtp-Source: AKy350a04czOqn/VQjYVzug54qWjnpYNkdQ0KwDaY+1xkwoSKNRFrgkk6usyBhulKeJZWbcA7JBxlg== X-Received: by 2002:a17:902:ce0c:b0:1a1:a800:96a7 with SMTP id k12-20020a170902ce0c00b001a1a80096a7mr2548581plg.8.1681831921971; Tue, 18 Apr 2023 08:32:01 -0700 
(PDT) Received: from localhost.localdomain ([2604:1380:4611:8100::1]) by smtp.gmail.com with ESMTPSA id ba4-20020a170902720400b001a647709864sm9769630plb.155.2023.04.18.08.32.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 18 Apr 2023 08:32:01 -0700 (PDT) From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, edumazet@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau Subject: [PATCH 4/7] bpf: udp: Implement batching for sockets iterator Date: Tue, 18 Apr 2023 15:31:45 +0000 Message-Id: <20230418153148.2231644-5-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230418153148.2231644-1-aditi.ghag@isovalent.com> References: <20230418153148.2231644-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org Batch UDP sockets in the BPF iterator, which allows for overlapping locking semantics in BPF/kernel helpers executed in BPF programs. This allows the BPF socket destroy kfunc (introduced by follow-up patches) to execute from BPF iterator programs. Previously, BPF iterators acquired the sock lock and sockets hash table bucket lock while executing BPF programs. This prevented BPF helpers that acquire these locks again from being executed from BPF iterators. With the batching approach, we acquire a bucket lock, batch all the bucket sockets, and then release the bucket lock. This enables BPF or kernel helpers to skip sock locking when invoked in the supported BPF contexts. The batching logic is similar to the logic implemented in TCP iterator: https://lore.kernel.org/bpf/20210701200613.1036157-1-kafai@fb.com/. 
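The batching steps described in this commit message (take the bucket lock, collect references to the matching sockets, drop the lock, then process the batch) can be sketched in user space roughly as follows. This is only an illustration and not code from the patch: it assumes a pthread mutex in place of the bucket spinlock and a plain counter in place of sock_hold()/sock_put(), and the names `item`, `bucket`, `batch_bucket`, `unlink_item` and `put_batch` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these names come from the patch. */
struct item {
	int refcnt;		/* stands in for the socket reference count */
	int wanted;		/* stands in for the seq_sk_match() filter */
	struct item *next;
};

struct bucket {
	pthread_mutex_t lock;	/* stands in for the hash bucket spinlock */
	struct item *head;
};

/* Snapshot the matching items of one bucket into batch[] under the lock.
 * Returns the number of matching items found, which may exceed max; only
 * the first max items are stored and referenced, so a caller can detect a
 * short batch by comparing the return value with max.
 */
static size_t batch_bucket(struct bucket *b, struct item **batch, size_t max)
{
	size_t stored = 0, matched = 0;
	struct item *it;

	pthread_mutex_lock(&b->lock);
	for (it = b->head; it; it = it->next) {
		if (!it->wanted)
			continue;
		if (stored < max) {
			it->refcnt++;	/* keep the item alive after unlock */
			batch[stored++] = it;
		}
		matched++;
	}
	pthread_mutex_unlock(&b->lock);
	return matched;
}

/* Called on a batched item with the bucket lock released, so this function
 * can take the bucket lock itself without self-deadlocking.
 */
static void unlink_item(struct bucket *b, struct item *victim)
{
	struct item **pp;

	pthread_mutex_lock(&b->lock);
	for (pp = &b->head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&b->lock);
}

/* Drop the references taken by batch_bucket() for the first n entries. */
static void put_batch(struct item **batch, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		assert(batch[i]->refcnt > 0);
		batch[i]->refcnt--;
	}
}
```

A caller that gets back a count larger than `max` would drop the stored references, grow the array and retry, which is the same idea as the resize-and-retry step in bpf_iter_udp_batch().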
Suggested-by: Martin KaFai Lau Signed-off-by: Aditi Ghag --- net/ipv4/udp.c | 209 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 203 insertions(+), 6 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 8689ed171776..f1c001641e53 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -3148,6 +3148,145 @@ struct bpf_iter__udp { int bucket __aligned(8); }; +struct bpf_udp_iter_state { + struct udp_iter_state state; + unsigned int cur_sk; + unsigned int end_sk; + unsigned int max_sk; + int offset; + struct sock **batch; + bool st_bucket_done; +}; + +static int bpf_iter_udp_realloc_batch(struct bpf_udp_iter_state *iter, + unsigned int new_batch_sz); +static struct sock *bpf_iter_udp_batch(struct seq_file *seq) +{ + struct bpf_udp_iter_state *iter = seq->private; + struct udp_iter_state *state = &iter->state; + struct net *net = seq_file_net(seq); + struct udp_seq_afinfo afinfo; + struct udp_table *udptable; + unsigned int batch_sks = 0; + bool resized = false; + struct sock *sk; + + /* The current batch is done, so advance the bucket. */ + if (iter->st_bucket_done) { + state->bucket++; + iter->offset = 0; + } + + afinfo.family = AF_UNSPEC; + afinfo.udp_table = NULL; + udptable = udp_get_table_afinfo(&afinfo, net); + +again: + /* New batch for the next bucket. + * Iterate over the hash table to find a bucket with sockets matching + * the iterator attributes, and return the first matching socket from + * the bucket. The remaining matched sockets from the bucket are batched + * before releasing the bucket lock. This allows BPF programs that are + * called in seq_show to acquire the bucket lock if needed. 
+ */ + iter->cur_sk = 0; + iter->end_sk = 0; + iter->st_bucket_done = false; + batch_sks = 0; + + for (; state->bucket <= udptable->mask; state->bucket++) { + struct udp_hslot *hslot2 = &udptable->hash2[state->bucket]; + + if (hlist_empty(&hslot2->head)) { + iter->offset = 0; + continue; + } + + spin_lock_bh(&hslot2->lock); + udp_portaddr_for_each_entry(sk, &hslot2->head) { + if (seq_sk_match(seq, sk)) { + /* Resume from the last iterated socket at the + * offset in the bucket before iterator was stopped. + */ + if (iter->offset) { + --iter->offset; + continue; + } + if (iter->end_sk < iter->max_sk) { + sock_hold(sk); + iter->batch[iter->end_sk++] = sk; + } + batch_sks++; + } + } + spin_unlock_bh(&hslot2->lock); + + if (iter->end_sk) + break; + + /* Reset the current bucket's offset before moving to the next bucket. */ + iter->offset = 0; + } + + /* All done: no batch made. */ + if (!iter->end_sk) + return NULL; + + if (iter->end_sk == batch_sks) { + /* Batching is done for the current bucket; return the first + * socket to be iterated from the batch. + */ + iter->st_bucket_done = true; + goto ret; + } + if (!resized && !bpf_iter_udp_realloc_batch(iter, batch_sks * 3 / 2)) { + resized = true; + /* Go back to the previous bucket to resize its batch. */ + state->bucket--; + goto again; + } +ret: + return iter->batch[0]; +} + +static void *bpf_iter_udp_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + struct bpf_udp_iter_state *iter = seq->private; + struct sock *sk; + + /* Whenever seq_next() is called, the iter->cur_sk is + * done with seq_show(), so unref the iter->cur_sk. + */ + if (iter->cur_sk < iter->end_sk) { + sock_put(iter->batch[iter->cur_sk++]); + ++iter->offset; + } + + /* After updating iter->cur_sk, check if there are more sockets + * available in the current bucket batch. + */ + if (iter->cur_sk < iter->end_sk) { + sk = iter->batch[iter->cur_sk]; + } else { + // Prepare a new batch. 
+ sk = bpf_iter_udp_batch(seq); + } + + ++*pos; + return sk; +} + +static void *bpf_iter_udp_seq_start(struct seq_file *seq, loff_t *pos) +{ + /* bpf iter does not support lseek, so it always + * continue from where it was stop()-ped. + */ + if (*pos) + return bpf_iter_udp_batch(seq); + + return SEQ_START_TOKEN; +} + static int udp_prog_seq_show(struct bpf_prog *prog, struct bpf_iter_meta *meta, struct udp_sock *udp_sk, uid_t uid, int bucket) { @@ -3168,18 +3307,37 @@ static int bpf_iter_udp_seq_show(struct seq_file *seq, void *v) struct bpf_prog *prog; struct sock *sk = v; uid_t uid; + int rc; if (v == SEQ_START_TOKEN) return 0; + lock_sock(sk); + + if (unlikely(sk_unhashed(sk))) { + rc = SEQ_SKIP; + goto unlock; + } + uid = from_kuid_munged(seq_user_ns(seq), sock_i_uid(sk)); meta.seq = seq; prog = bpf_iter_get_info(&meta, false); - return udp_prog_seq_show(prog, &meta, v, uid, state->bucket); + rc = udp_prog_seq_show(prog, &meta, v, uid, state->bucket); + +unlock: + release_sock(sk); + return rc; +} + +static void bpf_iter_udp_put_batch(struct bpf_udp_iter_state *iter) +{ + while (iter->cur_sk < iter->end_sk) + sock_put(iter->batch[iter->cur_sk++]); } static void bpf_iter_udp_seq_stop(struct seq_file *seq, void *v) { + struct bpf_udp_iter_state *iter = seq->private; struct bpf_iter_meta meta; struct bpf_prog *prog; @@ -3190,12 +3348,15 @@ static void bpf_iter_udp_seq_stop(struct seq_file *seq, void *v) (void)udp_prog_seq_show(prog, &meta, v, 0, 0); } - udp_seq_stop(seq, v); + if (iter->cur_sk < iter->end_sk) { + bpf_iter_udp_put_batch(iter); + iter->st_bucket_done = false; + } } static const struct seq_operations bpf_iter_udp_seq_ops = { - .start = udp_seq_start, - .next = udp_seq_next, + .start = bpf_iter_udp_seq_start, + .next = bpf_iter_udp_seq_next, .stop = bpf_iter_udp_seq_stop, .show = bpf_iter_udp_seq_show, }; @@ -3424,21 +3585,57 @@ static struct pernet_operations __net_initdata udp_sysctl_ops = { DEFINE_BPF_ITER_FUNC(udp, struct bpf_iter_meta *meta, 
struct udp_sock *udp_sk, uid_t uid, int bucket) +static int bpf_iter_udp_realloc_batch(struct bpf_udp_iter_state *iter, + unsigned int new_batch_sz) +{ + struct sock **new_batch; + + new_batch = kvmalloc_array(new_batch_sz, sizeof(*new_batch), + GFP_USER | __GFP_NOWARN); + if (!new_batch) + return -ENOMEM; + + bpf_iter_udp_put_batch(iter); + kvfree(iter->batch); + iter->batch = new_batch; + iter->max_sk = new_batch_sz; + + return 0; +} + +#define INIT_BATCH_SZ 16 + static int bpf_iter_init_udp(void *priv_data, struct bpf_iter_aux_info *aux) { - return bpf_iter_init_seq_net(priv_data, aux); + struct bpf_udp_iter_state *iter = priv_data; + int ret; + + ret = bpf_iter_init_seq_net(priv_data, aux); + if (ret) + return ret; + + ret = bpf_iter_udp_realloc_batch(iter, INIT_BATCH_SZ); + if (ret) { + bpf_iter_fini_seq_net(priv_data); + return ret; + } + + return ret; } static void bpf_iter_fini_udp(void *priv_data) { + struct bpf_udp_iter_state *iter = priv_data; + bpf_iter_fini_seq_net(priv_data); + kvfree(iter->batch); } static const struct bpf_iter_seq_info udp_seq_info = { .seq_ops = &bpf_iter_udp_seq_ops, .init_seq_private = bpf_iter_init_udp, .fini_seq_private = bpf_iter_fini_udp, - .seq_priv_size = sizeof(struct udp_iter_state), + .seq_priv_size = sizeof(struct bpf_udp_iter_state), }; static struct bpf_iter_reg udp_reg_info = { From patchwork Tue Apr 18 15:31:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13215825 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F57BC77B7D for ; Tue, 18 Apr 2023 15:32:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230507AbjDRPcH (ORCPT ); Tue, 18 Apr 2023 11:32:07 -0400 
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231623AbjDRPcF (ORCPT ); Tue, 18 Apr 2023 11:32:05 -0400 Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E8F9ADB for ; Tue, 18 Apr 2023 08:32:03 -0700 (PDT) Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-1a682eee3baso12157605ad.0 for ; Tue, 18 Apr 2023 08:32:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1681831923; x=1684423923; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Q03tK4+7QCiPvv4pq1TZX4Aa8oEnCF+IJuLgtUqci9Q=; b=OLy9YjxWZ2q03SgsHUG2rziltYz97pPP6wE+4Mor9cwaP3yQydLVEbVNRNecCGppXi FoT3XhUN4YjQrdgd6FH9SHsPtHvO3aNwuDEsP4VkH8DKdhpzt4y86JOCy6q5e4LC2Zfi Ax0u990oz61Q8sgBqxRXjOWQQT2pU9DCmdi5DtJrathsGBz9BaQeT1c982HiX6YR2fmc mie0/mOcXBtm8Rg29bu5oPcocn7RQuhmgDCzZYKRd4q+sI9d91oCpCAdXZtXHvrs6j58 dUvEXtHItZi9Ap0Cp48vRhVcxLra3zz7q1XoKFp3K5esRhNBW8sk/XiORQbgOnkowOWr cliQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681831923; x=1684423923; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Q03tK4+7QCiPvv4pq1TZX4Aa8oEnCF+IJuLgtUqci9Q=; b=Ffl7FqFCf9m5EVaNFwqzTN92HXfV9WeNr/sBfQOJOIvEL2CQMkqfTE/FVNsM90m07h DObRRcJLU3qNwRp1d+ETNjAGhFg2neGmu7b6LId+4j076+wf/VGzJRbeJavNGAklULkZ RssX0a1U9Qvm0y8IrYlNJPD+R3iN6U55VPT3oUozSIGwotbItI3t22TheM1Xl0D/874w DY6I7JHoD9vgIUFDkHpTJu0sAho7V9Z5Cp3jHhHejm1WpI7UBNv+LC/R/eOaFn7gXxBw yFeDyLJI+NcUzuZIB8yKgaNf4V/0wo6qbPz43S3x4oXvJfilTn9i4UtpaqCo3e3XjwdQ yaMA== X-Gm-Message-State: AAQBX9flWPpy34AKEgMj3Wuge6krZMhdwOqZ6MaEotD5NCUVKePw625f 
From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, edumazet@google.com, aditi.ghag@isovalent.com Subject: [PATCH 5/7] bpf: Add bpf_sock_destroy kfunc Date: Tue, 18 Apr 2023 15:31:46 +0000 Message-Id: <20230418153148.2231644-6-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230418153148.2231644-1-aditi.ghag@isovalent.com> References: <20230418153148.2231644-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net The socket destroy kfunc is used to forcefully terminate sockets from certain BPF contexts. We plan to use the capability in Cilium to force client sockets to reconnect when their remote load-balancing backends are deleted. The other use case is on-the-fly policy enforcement, where existing socket connections that are denied by policy need to be forcefully terminated. The helper allows terminating sockets that may or may not be actively sending traffic. The helper is currently exposed to certain BPF iterators, where users can filter and terminate selected sockets. Additionally, the helper can only be called from BPF contexts that ensure socket locking, which allows synchronous execution of the destroy handlers that would otherwise acquire the socket lock themselves.
The previous commit, which batches UDP sockets during iteration, makes it possible to invoke the destroy helper synchronously from BPF context, because the destroy handler can then skip taking the socket lock. TCP iterators already supported batching. The helper takes a `sock_common` type argument, even though it expects a full socket and casts the argument to a `sock` pointer. This enables the verifier to allow the sock_destroy kfunc to be called for TCP with `sock_common` and UDP with `sock` structs. As a comparison, BPF helpers enable this behavior with the `ARG_PTR_TO_BTF_ID_SOCK_COMMON` argument type. However, there is no such option available with the verifier logic that handles kfuncs where BTF types are inferred. Furthermore, as `sock_common` only has a subset of the fields of `sock`, casting the pointer to the latter type might not always be safe for certain sockets such as request sockets, but these are handled specially in the diag_destroy handlers. Signed-off-by: Aditi Ghag --- net/core/filter.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp.c | 10 ++++++--- net/ipv4/udp.c | 6 +++-- 3 files changed, 68 insertions(+), 5 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index 727c5269867d..7d1c1da77aa4 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -11715,3 +11715,60 @@ static int __init bpf_kfunc_init(void) return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp); } late_initcall(bpf_kfunc_init); + +/* Disables missing prototype warnings */ +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "Global functions as their definitions will be in vmlinux BTF"); + +/* bpf_sock_destroy: Destroy the given socket with ECONNABORTED error code. + * + * The helper expects a non-NULL pointer to a socket. It invokes the + * protocol specific socket destroy handlers. + * + * The helper can only be called from BPF contexts that have acquired the socket + * locks.
+ * + * Parameters: + * @sock: Pointer to socket to be destroyed + * + * Return: + * On error, may return -EOPNOTSUPP, -EINVAL. + * -EOPNOTSUPP if protocol specific destroy handler is not implemented. + * 0 otherwise + */ +__bpf_kfunc int bpf_sock_destroy(struct sock_common *sock) +{ + struct sock *sk = (struct sock *)sock; + + if (!sk) + return -EINVAL; + + /* The locking semantics that allow for synchronous execution of the + * destroy handlers are only supported for TCP and UDP. + * Supporting protocols will need to acquire lock_sock in the BPF context + * prior to invoking this kfunc. + */ + if (!sk->sk_prot->diag_destroy || (sk->sk_protocol != IPPROTO_TCP && + sk->sk_protocol != IPPROTO_UDP)) + return -EOPNOTSUPP; + + return sk->sk_prot->diag_destroy(sk, ECONNABORTED); +} + +__diag_pop() + +BTF_SET8_START(sock_destroy_kfunc_set) +BTF_ID_FLAGS(func, bpf_sock_destroy) +BTF_SET8_END(sock_destroy_kfunc_set) + +static const struct btf_kfunc_id_set bpf_sock_destroy_kfunc_set = { + .owner = THIS_MODULE, + .set = &sock_destroy_kfunc_set, +}; + +static int init_subsystem(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_sock_destroy_kfunc_set); +} +late_initcall(init_subsystem); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 288693981b00..2259b4facc2f 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4679,8 +4679,10 @@ int tcp_abort(struct sock *sk, int err) return 0; } - /* Don't race with userspace socket closes such as tcp_close. */ - lock_sock(sk); + /* BPF context ensures sock locking. */ + if (!has_current_bpf_ctx()) + /* Don't race with userspace socket closes such as tcp_close.
*/ + lock_sock(sk); if (sk->sk_state == TCP_LISTEN) { tcp_set_state(sk, TCP_CLOSE); @@ -4702,9 +4704,11 @@ int tcp_abort(struct sock *sk, int err) } bh_unlock_sock(sk); + local_bh_enable(); tcp_write_queue_purge(sk); - release_sock(sk); + if (!has_current_bpf_ctx()) + release_sock(sk); return 0; } EXPORT_SYMBOL_GPL(tcp_abort); diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index f1c001641e53..a358a71839ef 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2925,7 +2925,8 @@ EXPORT_SYMBOL(udp_poll); int udp_abort(struct sock *sk, int err) { - lock_sock(sk); + if (!has_current_bpf_ctx()) + lock_sock(sk); /* udp{v6}_destroy_sock() sets it under the sk lock, avoid racing * with close() @@ -2938,7 +2939,8 @@ int udp_abort(struct sock *sk, int err) __udp_disconnect(sk, 0); out: - release_sock(sk); + if (!has_current_bpf_ctx()) + release_sock(sk); return 0; } From patchwork Tue Apr 18 15:31:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13215826 X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, edumazet@google.com, aditi.ghag@isovalent.com Subject: [PATCH 6/7] selftests/bpf: Add helper to get port using getsockname Date: Tue, 18 Apr 2023 15:31:47 +0000 Message-Id: <20230418153148.2231644-7-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230418153148.2231644-1-aditi.ghag@isovalent.com> References: <20230418153148.2231644-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net The helper will be used to programmatically retrieve ports and pass them between userspace and kernel selftest programs. The port is returned in network byte order, as stored in `sin_port` or `sin6_port`. Suggested-by: Stanislav Fomichev Signed-off-by: Aditi Ghag --- tools/testing/selftests/bpf/network_helpers.c | 28 +++++++++++++++++++ tools/testing/selftests/bpf/network_helpers.h | 1 + 2 files changed, 29 insertions(+) diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index 596caa176582..7217cac762f0 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -427,3 +427,31 @@ void close_netns(struct nstoken *token) close(token->orig_netns_fd); free(token); } + +int get_socket_local_port(int family, int sock_fd, __u16 *out_port) +{ + socklen_t addr_len; + int err; + + if (family == AF_INET) { + struct sockaddr_in addr = {}; + + addr_len = sizeof(addr); + err = getsockname(sock_fd, (struct sockaddr *)&addr, &addr_len); + if (err < 0) + return err; + *out_port = addr.sin_port; + return 0; + } else if (family == AF_INET6) { + struct sockaddr_in6 addr = {}; + + addr_len = sizeof(addr); + err = getsockname(sock_fd, (struct sockaddr *)&addr, &addr_len); + if (err < 0) + return err; + *out_port = addr.sin6_port; + return 0; + } + + return -1; +} diff --git
a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index f882c691b790..ca4a147b58b8 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -56,6 +56,7 @@ int fastopen_connect(int server_fd, const char *data, unsigned int data_len, int make_sockaddr(int family, const char *addr_str, __u16 port, struct sockaddr_storage *addr, socklen_t *len); char *ping_command(int family); +int get_socket_local_port(int family, int sock_fd, __u16 *out_port); struct nstoken; /** From patchwork Tue Apr 18 15:31:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13215827 X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, edumazet@google.com, aditi.ghag@isovalent.com Subject: [PATCH 7/7] selftests/bpf: Test
bpf_sock_destroy Date: Tue, 18 Apr 2023 15:31:48 +0000 Message-Id: <20230418153148.2231644-8-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230418153148.2231644-1-aditi.ghag@isovalent.com> References: <20230418153148.2231644-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net The test cases for destroying sockets mirror the intended usages of the bpf_sock_destroy kfunc using iterators. The destroy helpers set the `ECONNABORTED` error code, which we can validate in the test code with client sockets. However, UDP sockets have an overriding error code from the disconnect called during abort, so the error code validation is only done for TCP sockets. Signed-off-by: Aditi Ghag --- .../selftests/bpf/prog_tests/sock_destroy.c | 217 ++++++++++++++++++ .../selftests/bpf/progs/sock_destroy_prog.c | 147 ++++++++++++ 2 files changed, 364 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_destroy.c create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_prog.c diff --git a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c new file mode 100644 index 000000000000..51f2454b7b4b --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c @@ -0,0 +1,217 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include + +#include "sock_destroy_prog.skel.h" +#include "network_helpers.h" + +#define TEST_NS "sock_destroy_netns" + +static void start_iter_sockets(struct bpf_program *prog) +{ + struct bpf_link *link; + char buf[50] = {}; + int iter_fd, len; + + link = bpf_program__attach_iter(prog, NULL); + if (!ASSERT_OK_PTR(link, "attach_iter")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (!ASSERT_GE(iter_fd, 0, "create_iter")) + goto free_link; + + while ((len = read(iter_fd, buf, sizeof(buf))) > 0) + ; + ASSERT_GE(len, 0, "read"); + +
close(iter_fd); + +free_link: + bpf_link__destroy(link); +} + +static void test_tcp_client(struct sock_destroy_prog *skel) +{ + int serv = -1, clien = -1, n = 0; + + serv = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0); + if (!ASSERT_GE(serv, 0, "start_server")) + goto cleanup_serv; + + clien = connect_to_fd(serv, 0); + if (!ASSERT_GE(clien, 0, "connect_to_fd")) + goto cleanup_serv; + + serv = accept(serv, NULL, NULL); + if (!ASSERT_GE(serv, 0, "serv accept")) + goto cleanup; + + n = send(clien, "t", 1, 0); + if (!ASSERT_GE(n, 0, "client send")) + goto cleanup; + + /* Run iterator program that destroys connected client sockets. */ + start_iter_sockets(skel->progs.iter_tcp6_client); + + n = send(clien, "t", 1, 0); + if (!ASSERT_LT(n, 0, "client_send on destroyed socket")) + goto cleanup; + ASSERT_EQ(errno, ECONNABORTED, "error code on destroyed socket"); + + +cleanup: + close(clien); +cleanup_serv: + close(serv); +} + +static void test_tcp_server(struct sock_destroy_prog *skel) +{ + int serv = -1, clien = -1, n = 0, err; + __u16 serv_port = 0; + + serv = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0); + if (!ASSERT_GE(serv, 0, "start_server")) + goto cleanup_serv; + err = get_socket_local_port(AF_INET6, serv, &serv_port); + if (!ASSERT_EQ(err, 0, "get_local_port")) + goto cleanup; + skel->bss->serv_port = serv_port; + + clien = connect_to_fd(serv, 0); + if (!ASSERT_GE(clien, 0, "connect_to_fd")) + goto cleanup_serv; + + serv = accept(serv, NULL, NULL); + if (!ASSERT_GE(serv, 0, "serv accept")) + goto cleanup; + + n = send(clien, "t", 1, 0); + if (!ASSERT_GE(n, 0, "client send")) + goto cleanup; + + /* Run iterator program that destroys server sockets. 
*/ + start_iter_sockets(skel->progs.iter_tcp6_server); + + n = send(clien, "t", 1, 0); + if (!ASSERT_LT(n, 0, "client_send on destroyed socket")) + goto cleanup; + ASSERT_EQ(errno, ECONNRESET, "error code on destroyed socket"); + + +cleanup: + close(clien); +cleanup_serv: + close(serv); +} + + +static void test_udp_client(struct sock_destroy_prog *skel) +{ + int serv = -1, clien = -1, n = 0; + + serv = start_server(AF_INET6, SOCK_DGRAM, NULL, 0, 0); + if (!ASSERT_GE(serv, 0, "start_server")) + goto cleanup_serv; + + clien = connect_to_fd(serv, 0); + if (!ASSERT_GE(clien, 0, "connect_to_fd")) + goto cleanup_serv; + + n = send(clien, "t", 1, 0); + if (!ASSERT_GE(n, 0, "client send")) + goto cleanup; + + /* Run iterator program that destroys sockets. */ + start_iter_sockets(skel->progs.iter_udp6_client); + + n = send(clien, "t", 1, 0); + if (!ASSERT_LT(n, 0, "client_send on destroyed socket")) + goto cleanup; + /* UDP sockets have an overriding error code after they are disconnected, + * so we don't check for ECONNABORTED error code. + */ + +cleanup: + close(clien); +cleanup_serv: + close(serv); +} + +static void test_udp_server(struct sock_destroy_prog *skel) +{ + int *listen_fds = NULL, n, i, err; + unsigned int num_listens = 5; + char buf[1]; + __u16 serv_port; + + /* Start reuseport servers. */ + listen_fds = start_reuseport_server(AF_INET6, SOCK_DGRAM, + "::1", 0, 0, num_listens); + if (!ASSERT_OK_PTR(listen_fds, "start_reuseport_server")) + goto cleanup; + err = get_socket_local_port(AF_INET6, listen_fds[0], &serv_port); + if (!ASSERT_EQ(err, 0, "get_local_port")) + goto cleanup; + skel->bss->serv_port = ntohs(serv_port); + + /* Run iterator program that destroys server sockets. 
*/ + start_iter_sockets(skel->progs.iter_udp6_server); + + for (i = 0; i < num_listens; ++i) { + n = read(listen_fds[i], buf, sizeof(buf)); + if (!ASSERT_EQ(n, -1, "read") || + !ASSERT_EQ(errno, ECONNABORTED, "error code on destroyed socket")) + break; + } + ASSERT_EQ(i, num_listens, "server socket"); + +cleanup: + free_fds(listen_fds, num_listens); +} + +void test_sock_destroy(void) +{ + struct sock_destroy_prog *skel; + struct nstoken *nstoken = NULL; + int cgroup_fd = 0; + + skel = sock_destroy_prog__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + return; + + cgroup_fd = test__join_cgroup("/sock_destroy"); + if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup")) + goto close_cgroup_fd; + + skel->links.sock_connect = bpf_program__attach_cgroup( + skel->progs.sock_connect, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links.sock_connect, "prog_attach")) + goto close_cgroup_fd; + + SYS(fail, "ip netns add %s", TEST_NS); + SYS(fail, "ip -net %s link set dev lo up", TEST_NS); + + nstoken = open_netns(TEST_NS); + if (!ASSERT_OK_PTR(nstoken, "open_netns")) + goto fail; + + if (test__start_subtest("tcp_client")) + test_tcp_client(skel); + if (test__start_subtest("tcp_server")) + test_tcp_server(skel); + if (test__start_subtest("udp_client")) + test_udp_client(skel); + if (test__start_subtest("udp_server")) + test_udp_server(skel); + + +fail: + if (nstoken) + close_netns(nstoken); + SYS_NOFAIL("ip netns del " TEST_NS " &> /dev/null"); +close_cgroup_fd: + close(cgroup_fd); + sock_destroy_prog__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/sock_destroy_prog.c b/tools/testing/selftests/bpf/progs/sock_destroy_prog.c new file mode 100644 index 000000000000..1f265e0d9dea --- /dev/null +++ b/tools/testing/selftests/bpf/progs/sock_destroy_prog.c @@ -0,0 +1,147 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "vmlinux.h" +#include +#include + +#include "bpf_tracing_net.h" + +#define AF_INET6 10 + +__u16 serv_port = 0; + +int bpf_sock_destroy(struct sock_common 
*sk) __ksym; + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} tcp_conn_sockets SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} udp_conn_sockets SEC(".maps"); + +SEC("cgroup/connect6") +int sock_connect(struct bpf_sock_addr *ctx) +{ + int key = 0; + __u64 sock_cookie = 0; + __u32 keyc = 0; + + if (ctx->family != AF_INET6 || ctx->user_family != AF_INET6) + return 1; + + sock_cookie = bpf_get_socket_cookie(ctx); + if (ctx->protocol == IPPROTO_TCP) + bpf_map_update_elem(&tcp_conn_sockets, &key, &sock_cookie, 0); + else if (ctx->protocol == IPPROTO_UDP) + bpf_map_update_elem(&udp_conn_sockets, &keyc, &sock_cookie, 0); + else + return 1; + + return 1; +} + +SEC("iter/tcp") +int iter_tcp6_client(struct bpf_iter__tcp *ctx) +{ + struct sock_common *sk_common = ctx->sk_common; + __u64 sock_cookie = 0; + __u64 *val; + int key = 0; + + if (!sk_common) + return 0; + + if (sk_common->skc_family != AF_INET6) + return 0; + + sock_cookie = bpf_get_socket_cookie(sk_common); + val = bpf_map_lookup_elem(&tcp_conn_sockets, &key); + if (!val) + return 0; + /* Destroy connected client sockets. */ + if (sock_cookie == *val) + bpf_sock_destroy(sk_common); + + return 0; +} + +SEC("iter/tcp") +int iter_tcp6_server(struct bpf_iter__tcp *ctx) +{ + struct sock_common *sk_common = ctx->sk_common; + struct tcp6_sock *tcp_sk; + const struct inet_connection_sock *icsk; + const struct inet_sock *inet; + __u16 srcp; + + if (!sk_common) + return 0; + + if (sk_common->skc_family != AF_INET6) + return 0; + + tcp_sk = bpf_skc_to_tcp6_sock(sk_common); + if (!tcp_sk) + return 0; + + icsk = &tcp_sk->tcp.inet_conn; + inet = &icsk->icsk_inet; + srcp = inet->inet_sport; + + /* Destroy server sockets. 
*/ + if (srcp == serv_port) + bpf_sock_destroy(sk_common); + + return 0; +} + + +SEC("iter/udp") +int iter_udp6_client(struct bpf_iter__udp *ctx) +{ + struct udp_sock *udp_sk = ctx->udp_sk; + struct sock *sk = (struct sock *) udp_sk; + __u64 sock_cookie = 0, *val; + int key = 0; + + if (!sk) + return 0; + + sock_cookie = bpf_get_socket_cookie(sk); + val = bpf_map_lookup_elem(&udp_conn_sockets, &key); + if (!val) + return 0; + /* Destroy connected client sockets. */ + if (sock_cookie == *val) + bpf_sock_destroy((struct sock_common *)sk); + + return 0; +} + +SEC("iter/udp") +int iter_udp6_server(struct bpf_iter__udp *ctx) +{ + struct udp_sock *udp_sk = ctx->udp_sk; + struct sock *sk = (struct sock *) udp_sk; + __u16 srcp; + struct inet_sock *inet; + + if (!sk) + return 0; + + inet = &udp_sk->inet; + srcp = bpf_ntohs(inet->inet_sport); + if (srcp == serv_port) + bpf_sock_destroy((struct sock_common *)sk); + + return 0; +} + +char _license[] SEC("license") = "GPL";
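A note on byte order in the two server iterators above, offered as a minimal sketch and not as part of the patch: test_tcp_server stores the port from getsockname() unchanged and iter_tcp6_server compares it with inet_sport directly, while test_udp_server stores ntohs(port) and iter_udp6_server applies bpf_ntohs() before comparing. Both conventions select the same socket as long as the two sides of each comparison use the same byte order. The names below (match_raw, match_host, port_match_selfcheck) are hypothetical and exist only for this illustration; the code is plain userspace C, not BPF.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TCP-style check (hypothetical name): both values stay in network byte order. */
static bool match_raw(uint16_t sport_be, uint16_t serv_port_be)
{
	return sport_be == serv_port_be;
}

/* UDP-style check (hypothetical name): the socket's port is converted to
 * host order and compared against a host-order port.
 */
static bool match_host(uint16_t sport_be, uint16_t serv_port_host)
{
	return ntohs(sport_be) == serv_port_host;
}

/* Both conventions agree on the same port; 8080 is an arbitrary example. */
static void port_match_selfcheck(void)
{
	uint16_t wire = htons(8080);

	assert(match_raw(wire, wire));
	assert(match_host(wire, 8080));
	assert(!match_host(wire, 8081));
}
```

Mixing the two conventions, for example comparing a raw inet_sport against a host-order port, would generally fail to match on little-endian machines.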