From patchwork Wed May 17 17:54:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13245482
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, yhs@meta.com, aditi.ghag@isovalent.com
Subject: [PATCH v8 bpf-next 01/10] bpf: tcp: Avoid taking fast sock lock in iterator
Date: Wed, 17 May 2023 17:54:58 +0000
Message-Id: <20230517175458.527970-1-aditi.ghag@isovalent.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: bpf@vger.kernel.org

This is a preparatory commit that replaces `lock_sock_fast` with
`lock_sock` in the BPF TCP iterator, so that BPF programs executed from
the iterator can destroy TCP listening sockets using the
bpf_sock_destroy kfunc (implemented in follow-up commits).

Previously, the BPF TCP iterator acquired the sock lock with BH
disabled. As a result, the sockets hash table bucket lock could be
acquired with BH enabled in some contexts and disabled in others, which
created a lock-order dependency with the sock lock that lockdep reports
as interrupt unsafe.

Here is a snippet of the annotated stack trace that motivated this
change:

```
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&h->lhash2[i].lock);
                               local_irq_disable();
                               lock(slock-AF_INET6);
                               lock(&h->lhash2[i].lock);
  <Interrupt>
    lock(slock-AF_INET6);

 *** DEADLOCK ***

process context:

lock_acquire+0xcd/0x330
_raw_spin_lock+0x33/0x40
------> Acquire (bucket) lhash2.lock with BH enabled
__inet_hash+0x4b/0x210
inet_csk_listen_start+0xe6/0x100
inet_listen+0x95/0x1d0
__sys_listen+0x69/0xb0
__x64_sys_listen+0x14/0x20
do_syscall_64+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc

bpf_sock_destroy run from iterator in interrupt context:

lock_acquire+0xcd/0x330
_raw_spin_lock+0x33/0x40
------> Acquire (bucket) lhash2.lock with BH disabled
inet_unhash+0x9a/0x110
tcp_set_state+0x6a/0x210
tcp_abort+0x10d/0x200
bpf_prog_6793c5ca50c43c0d_iter_tcp6_server+0xa4/0xa9
bpf_iter_run_prog+0x1ff/0x340
------> lock_sock_fast that acquires sock lock with BH disabled
bpf_iter_tcp_seq_show+0xca/0x190
bpf_seq_read+0x177/0x450
```

Acked-by: Yonghong Song
Acked-by: Stanislav Fomichev
Signed-off-by: Aditi Ghag
---
 net/ipv4/tcp_ipv4.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index ea370afa70ed..f2d370a9450f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2962,7 +2962,6 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
 	struct bpf_iter_meta meta;
 	struct bpf_prog *prog;
 	struct sock *sk = v;
-	bool slow;
 	uid_t uid;
 	int ret;
 
@@ -2970,7 +2969,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
 		return 0;
 
 	if (sk_fullsock(sk))
-		slow = lock_sock_fast(sk);
+		lock_sock(sk);
 
 	if (unlikely(sk_unhashed(sk))) {
 		ret = SEQ_SKIP;
@@ -2994,7 +2993,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
 
 unlock:
 	if (sk_fullsock(sk))
-		unlock_sock_fast(sk, slow);
+		release_sock(sk);
 	return ret;
 
 }

From patchwork Wed May 17 17:55:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13245483
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com
Subject: [PATCH v8 bpf-next 02/10] udp: seq_file: Helper function to match socket attributes
Date: Wed, 17 May 2023 17:55:25 +0000
Message-Id: <20230517175525.528000-1-aditi.ghag@isovalent.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: bpf@vger.kernel.org

This is a preparatory commit that refactors the code matching socket
attributes in iterators into a helper function, seq_sk_match(), and
uses it in the proc fs iterator.
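
For illustration only, the matching rule can be sketched in user space
as below. This is not the kernel code: `toy_sock`, `toy_seq` and
`toy_sk_match` are hypothetical stand-ins, and an integer id is assumed
in place of the network namespace comparison done by net_eq().

```c
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical stand-in for struct sock: only the fields the rule needs. */
struct toy_sock {
	unsigned short family;
	int netns_id;
};

/* Hypothetical stand-in for what the iterator knows about itself. */
struct toy_seq {
	unsigned short family;	/* AF_UNSPEC means "match every family" */
	int netns_id;
};

/* A socket matches when the family matches (or the iterator asks for
 * all families) and both sides belong to the same namespace.
 */
static bool toy_sk_match(const struct toy_seq *seq, const struct toy_sock *sk)
{
	return (seq->family == AF_UNSPEC || seq->family == sk->family) &&
	       seq->netns_id == sk->netns_id;
}
```

In this sketch an iterator with family AF_UNSPEC accepts both AF_INET
and AF_INET6 sockets from its own namespace, and rejects any socket
from a different namespace.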

Signed-off-by: Aditi Ghag
---
 net/ipv4/udp.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c605d171eb2d..71e3fef44fd5 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2983,6 +2983,16 @@ EXPORT_SYMBOL(udp_prot);
 /* ------------------------------------------------------------------------ */
 #ifdef CONFIG_PROC_FS
 
+static unsigned short seq_file_family(const struct seq_file *seq);
+static bool seq_sk_match(struct seq_file *seq, const struct sock *sk)
+{
+	unsigned short family = seq_file_family(seq);
+
+	/* AF_UNSPEC is used as a match all */
+	return ((family == AF_UNSPEC || family == sk->sk_family) &&
+		net_eq(sock_net(sk), seq_file_net(seq)));
+}
+
 static struct udp_table *udp_get_table_afinfo(struct udp_seq_afinfo *afinfo,
 					      struct net *net)
 {
@@ -3013,10 +3023,7 @@ static struct sock *udp_get_first(struct seq_file *seq, int start)
 
 		spin_lock_bh(&hslot->lock);
 		sk_for_each(sk, &hslot->head) {
-			if (!net_eq(sock_net(sk), net))
-				continue;
-			if (afinfo->family == AF_UNSPEC ||
-			    sk->sk_family == afinfo->family)
+			if (seq_sk_match(seq, sk))
 				goto found;
 		}
 		spin_unlock_bh(&hslot->lock);
@@ -3040,9 +3047,7 @@ static struct sock *udp_get_next(struct seq_file *seq, struct sock *sk)
 
 	do {
 		sk = sk_next(sk);
-	} while (sk && (!net_eq(sock_net(sk), net) ||
-			(afinfo->family != AF_UNSPEC &&
-			 sk->sk_family != afinfo->family)));
+	} while (sk && !seq_sk_match(seq, sk));
 
 	if (!sk) {
 		udptable = udp_get_table_afinfo(afinfo, net);
@@ -3205,6 +3210,21 @@ static const struct seq_operations bpf_iter_udp_seq_ops = {
 };
 #endif
 
+static unsigned short seq_file_family(const struct seq_file *seq)
+{
+	const struct udp_seq_afinfo *afinfo;
+
+#ifdef CONFIG_BPF_SYSCALL
+	/* BPF iterator: bpf programs to filter sockets. */
+	if (seq->op == &bpf_iter_udp_seq_ops)
+		return AF_UNSPEC;
+#endif
+
+	/* Proc fs iterator */
+	afinfo = pde_data(file_inode(seq->file));
+	return afinfo->family;
+}
+
 const struct seq_operations udp_seq_ops = {
 	.start		= udp_seq_start,
 	.next		= udp_seq_next,

From patchwork Wed May 17 17:56:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13245484
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau
Subject: [PATCH v8 bpf-next 03/10] bpf: udp: Encapsulate logic to get udp table
Date: Wed, 17 May 2023 17:56:27 +0000
Message-Id: <20230517175627.528080-1-aditi.ghag@isovalent.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: bpf@vger.kernel.org

This is a preparatory commit that moves the logic the iterator uses to
get the udp table into udp_get_table_afinfo, and renames the function
to `udp_get_table_seq` accordingly.

Suggested-by: Martin KaFai Lau
Signed-off-by: Aditi Ghag
---
 net/ipv4/udp.c | 35 ++++++++++++-----------------------
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 71e3fef44fd5..c426ebafeb13 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2993,9 +2993,16 @@ static bool seq_sk_match(struct seq_file *seq, const struct sock *sk)
 		net_eq(sock_net(sk), seq_file_net(seq)));
 }
 
-static struct udp_table *udp_get_table_afinfo(struct udp_seq_afinfo *afinfo,
-					      struct net *net)
+static struct udp_table *udp_get_table_seq(struct seq_file *seq,
+					   struct net *net)
 {
+	const struct udp_iter_state *state = seq->private;
+	const struct udp_seq_afinfo *afinfo;
+
+	if (state->bpf_seq_afinfo)
+		return net->ipv4.udp_table;
+
+	afinfo = pde_data(file_inode(seq->file));
 	return afinfo->udp_table ? : net->ipv4.udp_table;
 }
 
@@ -3003,16 +3010,10 @@ static struct sock *udp_get_first(struct seq_file *seq, int start)
 {
 	struct udp_iter_state *state = seq->private;
 	struct net *net = seq_file_net(seq);
-	struct udp_seq_afinfo *afinfo;
 	struct udp_table *udptable;
 	struct sock *sk;
 
-	if (state->bpf_seq_afinfo)
-		afinfo = state->bpf_seq_afinfo;
-	else
-		afinfo = pde_data(file_inode(seq->file));
-
-	udptable = udp_get_table_afinfo(afinfo, net);
+	udptable = udp_get_table_seq(seq, net);
 
 	for (state->bucket = start; state->bucket <= udptable->mask;
 	     ++state->bucket) {
@@ -3037,20 +3038,14 @@ static struct sock *udp_get_next(struct seq_file *seq, struct sock *sk)
 {
 	struct udp_iter_state *state = seq->private;
 	struct net *net = seq_file_net(seq);
-	struct udp_seq_afinfo *afinfo;
 	struct udp_table *udptable;
 
-	if (state->bpf_seq_afinfo)
-		afinfo = state->bpf_seq_afinfo;
-	else
-		afinfo = pde_data(file_inode(seq->file));
-
 	do {
 		sk = sk_next(sk);
 	} while (sk && !seq_sk_match(seq, sk));
 
 	if (!sk) {
-		udptable = udp_get_table_afinfo(afinfo, net);
+		udptable = udp_get_table_seq(seq, net);
 
 		if (state->bucket <= udptable->mask)
 			spin_unlock_bh(&udptable->hash[state->bucket].lock);
@@ -3096,15 +3091,9 @@ EXPORT_SYMBOL(udp_seq_next);
 void udp_seq_stop(struct seq_file *seq, void *v)
 {
 	struct udp_iter_state *state = seq->private;
-	struct udp_seq_afinfo *afinfo;
 	struct udp_table *udptable;
 
-	if (state->bpf_seq_afinfo)
-		afinfo = state->bpf_seq_afinfo;
-	else
-		afinfo = pde_data(file_inode(seq->file));
-
-	udptable = udp_get_table_afinfo(afinfo, seq_file_net(seq));
+	udptable = udp_get_table_seq(seq, seq_file_net(seq));
 
 	if (state->bucket <= udptable->mask)
 		spin_unlock_bh(&udptable->hash[state->bucket].lock);

From patchwork Wed May 17 17:57:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13245485
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau
Subject: [PATCH v8 bpf-next 04/10] udp: seq_file: Remove bpf_seq_afinfo from udp_iter_state
Date: Wed, 17 May 2023 17:57:11 +0000
Message-Id: <20230517175711.528170-1-aditi.ghag@isovalent.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: bpf@vger.kernel.org

This is a preparatory commit to remove the bpf_seq_afinfo field from
udp_iter_state. The field was previously shared between the proc fs and
BPF UDP socket iterators. As the follow-up commits will decouple the
implementations of the two iterators, remove the field. For the BPF
socket iterator, filtering of sockets is expected to be done in BPF
programs.

Suggested-by: Martin KaFai Lau
Signed-off-by: Aditi Ghag
---
 include/net/udp.h |  1 -
 net/ipv4/udp.c    | 27 +++++++--------------------
 2 files changed, 7 insertions(+), 21 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index de4b528522bb..5cad44318d71 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -437,7 +437,6 @@ struct udp_seq_afinfo {
 struct udp_iter_state {
 	struct seq_net_private  p;
 	int			bucket;
-	struct udp_seq_afinfo	*bpf_seq_afinfo;
 };
 
 void *udp_seq_start(struct seq_file *seq, loff_t *pos);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c426ebafeb13..289ef05b5c15 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2993,14 +2993,18 @@ static bool seq_sk_match(struct seq_file *seq, const struct sock *sk)
 		net_eq(sock_net(sk), seq_file_net(seq)));
 }
 
+#ifdef CONFIG_BPF_SYSCALL
+static const struct seq_operations bpf_iter_udp_seq_ops;
+#endif
 static struct udp_table *udp_get_table_seq(struct seq_file *seq,
 					   struct net *net)
 {
-	const struct udp_iter_state *state = seq->private;
 	const struct udp_seq_afinfo *afinfo;
 
-	if (state->bpf_seq_afinfo)
+#ifdef CONFIG_BPF_SYSCALL
+	if (seq->op == &bpf_iter_udp_seq_ops)
 		return net->ipv4.udp_table;
+#endif
 
 	afinfo = pde_data(file_inode(seq->file));
 	return afinfo->udp_table ? : net->ipv4.udp_table;
@@ -3424,28 +3428,11 @@ DEFINE_BPF_ITER_FUNC(udp, struct bpf_iter_meta *meta,
 
 static int bpf_iter_init_udp(void *priv_data, struct bpf_iter_aux_info *aux)
 {
-	struct udp_iter_state *st = priv_data;
-	struct udp_seq_afinfo *afinfo;
-	int ret;
-
-	afinfo = kmalloc(sizeof(*afinfo), GFP_USER | __GFP_NOWARN);
-	if (!afinfo)
-		return -ENOMEM;
-
-	afinfo->family = AF_UNSPEC;
-	afinfo->udp_table = NULL;
-	st->bpf_seq_afinfo = afinfo;
-	ret = bpf_iter_init_seq_net(priv_data, aux);
-	if (ret)
-		kfree(afinfo);
-	return ret;
+	return bpf_iter_init_seq_net(priv_data, aux);
 }
 
 static void bpf_iter_fini_udp(void *priv_data)
 {
-	struct udp_iter_state *st = priv_data;
-
-	kfree(st->bpf_seq_afinfo);
 	bpf_iter_fini_seq_net(priv_data);
 }
 

From patchwork Wed May 17 17:57:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13245486
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau
Subject: [PATCH v8 bpf-next 05/10] bpf: udp: Implement batching for sockets iterator
Date: Wed, 17 May 2023 17:57:25 +0000
Message-Id: <20230517175725.528192-1-aditi.ghag@isovalent.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: bpf@vger.kernel.org

Batch UDP sockets in the BPF iterator so that BPF/kernel helpers
executed in BPF programs can use overlapping locking semantics. This
allows the BPF socket destroy kfunc (introduced by follow-up patches)
to execute from BPF iterator programs.

Previously, BPF iterators acquired the sock lock and the sockets hash
table bucket lock while executing BPF programs. This prevented BPF
helpers that acquire these locks again from being executed from BPF
iterators. With the batching approach, we acquire a bucket lock, batch
all the bucket sockets, and then release the bucket lock. This enables
BPF or kernel helpers to skip sock locking when invoked in the
supported BPF contexts.

The batching logic is similar to the logic implemented in the TCP
iterator:
https://lore.kernel.org/bpf/20210701200613.1036157-1-kafai@fb.com/.
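
As a rough user-space illustration of the batching approach (a sketch
under stated assumptions, not the kernel implementation): all `toy_*`
names below are hypothetical, a boolean flag is assumed in place of the
bucket spinlock, and a plain counter in place of sock_hold()/sock_put().
The sketch takes the "lock", takes a reference on every socket in the
bucket, drops the "lock", and retries once with a larger array if the
bucket did not fit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy types; these are not the kernel's structures. */
struct toy_sock {
	int id;
	int refcnt;
	struct toy_sock *next;
};

struct toy_bucket {
	bool locked;		/* stands in for the bucket spinlock */
	struct toy_sock *head;
};

struct toy_batch {
	struct toy_sock **sks;
	unsigned int end_sk;	/* number of sockets held in sks[] */
	unsigned int max_sk;	/* capacity of sks[] */
};

/* Drop every reference currently held by the batch. */
static void toy_put_batch(struct toy_batch *b)
{
	while (b->end_sk)
		b->sks[--b->end_sk]->refcnt--;
}

/* Replace the batch array; returns 0 on success, -1 on failure. */
static int toy_realloc_batch(struct toy_batch *b, unsigned int new_sz)
{
	struct toy_sock **sks = calloc(new_sz, sizeof(*sks));

	if (!sks)
		return -1;
	toy_put_batch(b);
	free(b->sks);
	b->sks = sks;
	b->max_sk = new_sz;
	return 0;
}

/* Reference the sockets of one bucket while its "lock" is held, then
 * release the lock. Returns true if the whole bucket was captured.
 */
static bool toy_batch_bucket(struct toy_bucket *bkt, struct toy_batch *b)
{
	bool resized = false;
	unsigned int seen;
	struct toy_sock *sk;

again:
	toy_put_batch(b);
	seen = 0;
	bkt->locked = true;
	for (sk = bkt->head; sk; sk = sk->next) {
		if (b->end_sk < b->max_sk) {
			sk->refcnt++;
			b->sks[b->end_sk++] = sk;
		}
		seen++;
	}
	bkt->locked = false;

	if (b->end_sk == seen)
		return true;
	/* Bucket did not fit: grow the array once and retry. */
	if (!resized && !toy_realloc_batch(b, seen * 3 / 2)) {
		resized = true;
		goto again;
	}
	return false;	/* partial batch; the caller can still use it */
}
```

Once toy_batch_bucket() returns, the caller can walk sks[0..end_sk)
with the bucket "lock" released, which is the property that lets a
callback take that lock itself.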

Suggested-by: Martin KaFai Lau
Signed-off-by: Aditi Ghag
---
 net/ipv4/udp.c | 205 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 199 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 289ef05b5c15..8fe2fd6255cc 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -3150,6 +3150,143 @@ struct bpf_iter__udp {
 	int bucket __aligned(8);
 };
 
+struct bpf_udp_iter_state {
+	struct udp_iter_state state;
+	unsigned int cur_sk;
+	unsigned int end_sk;
+	unsigned int max_sk;
+	int offset;
+	struct sock **batch;
+	bool st_bucket_done;
+};
+
+static int bpf_iter_udp_realloc_batch(struct bpf_udp_iter_state *iter,
+				      unsigned int new_batch_sz);
+static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
+{
+	struct bpf_udp_iter_state *iter = seq->private;
+	struct udp_iter_state *state = &iter->state;
+	struct net *net = seq_file_net(seq);
+	struct udp_table *udptable;
+	unsigned int batch_sks = 0;
+	bool resized = false;
+	struct sock *sk;
+
+	/* The current batch is done, so advance the bucket. */
+	if (iter->st_bucket_done) {
+		state->bucket++;
+		iter->offset = 0;
+	}
+
+	udptable = udp_get_table_seq(seq, net);
+
+again:
+	/* New batch for the next bucket.
+	 * Iterate over the hash table to find a bucket with sockets matching
+	 * the iterator attributes, and return the first matching socket from
+	 * the bucket. The remaining matched sockets from the bucket are batched
+	 * before releasing the bucket lock. This allows BPF programs that are
+	 * called in seq_show to acquire the bucket lock if needed.
+	 */
+	iter->cur_sk = 0;
+	iter->end_sk = 0;
+	iter->st_bucket_done = false;
+	batch_sks = 0;
+
+	for (; state->bucket <= udptable->mask; state->bucket++) {
+		struct udp_hslot *hslot2 = &udptable->hash2[state->bucket];
+
+		if (hlist_empty(&hslot2->head)) {
+			iter->offset = 0;
+			continue;
+		}
+
+		spin_lock_bh(&hslot2->lock);
+		udp_portaddr_for_each_entry(sk, &hslot2->head) {
+			if (seq_sk_match(seq, sk)) {
+				/* Resume from the last iterated socket at the
+				 * offset in the bucket before iterator was stopped.
+				 */
+				if (iter->offset) {
+					--iter->offset;
+					continue;
+				}
+				if (iter->end_sk < iter->max_sk) {
+					sock_hold(sk);
+					iter->batch[iter->end_sk++] = sk;
+				}
+				batch_sks++;
+			}
+		}
+		spin_unlock_bh(&hslot2->lock);
+
+		if (iter->end_sk)
+			break;
+
+		/* Reset the current bucket's offset before moving to the next bucket. */
+		iter->offset = 0;
+	}
+
+	/* All done: no batch made. */
+	if (!iter->end_sk)
+		return NULL;
+
+	if (iter->end_sk == batch_sks) {
+		/* Batching is done for the current bucket; return the first
+		 * socket to be iterated from the batch.
+		 */
+		iter->st_bucket_done = true;
+		goto done;
+	}
+	if (!resized && !bpf_iter_udp_realloc_batch(iter, batch_sks * 3 / 2)) {
+		resized = true;
+		/* After allocating a larger batch, retry one more time to grab
+		 * the whole bucket.
+		 */
+		state->bucket--;
+		goto again;
+	}
+done:
+	return iter->batch[0];
+}
+
+static void *bpf_iter_udp_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct bpf_udp_iter_state *iter = seq->private;
+	struct sock *sk;
+
+	/* Whenever seq_next() is called, the iter->cur_sk is
+	 * done with seq_show(), so unref the iter->cur_sk.
+	 */
+	if (iter->cur_sk < iter->end_sk) {
+		sock_put(iter->batch[iter->cur_sk++]);
+		++iter->offset;
+	}
+
+	/* After updating iter->cur_sk, check if there are more sockets
+	 * available in the current bucket batch.
+	 */
+	if (iter->cur_sk < iter->end_sk)
+		sk = iter->batch[iter->cur_sk];
+	else
+		/* Prepare a new batch. */
+		sk = bpf_iter_udp_batch(seq);
+
+	++*pos;
+	return sk;
+}
+
+static void *bpf_iter_udp_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	/* bpf iter does not support lseek, so it always
+	 * continue from where it was stop()-ped.
+	 */
+	if (*pos)
+		return bpf_iter_udp_batch(seq);
+
+	return SEQ_START_TOKEN;
+}
+
 static int udp_prog_seq_show(struct bpf_prog *prog, struct bpf_iter_meta *meta,
 			     struct udp_sock *udp_sk, uid_t uid, int bucket)
 {
@@ -3170,18 +3307,37 @@ static int bpf_iter_udp_seq_show(struct seq_file *seq, void *v)
 	struct bpf_prog *prog;
 	struct sock *sk = v;
 	uid_t uid;
+	int ret;
 
 	if (v == SEQ_START_TOKEN)
 		return 0;
 
+	lock_sock(sk);
+
+	if (unlikely(sk_unhashed(sk))) {
+		ret = SEQ_SKIP;
+		goto unlock;
+	}
+
 	uid = from_kuid_munged(seq_user_ns(seq), sock_i_uid(sk));
 	meta.seq = seq;
 	prog = bpf_iter_get_info(&meta, false);
-	return udp_prog_seq_show(prog, &meta, v, uid, state->bucket);
+	ret = udp_prog_seq_show(prog, &meta, v, uid, state->bucket);
+
+unlock:
+	release_sock(sk);
+	return ret;
+}
+
+static void bpf_iter_udp_put_batch(struct bpf_udp_iter_state *iter)
+{
+	while (iter->cur_sk < iter->end_sk)
+		sock_put(iter->batch[iter->cur_sk++]);
 }
 
 static void bpf_iter_udp_seq_stop(struct seq_file *seq, void *v)
 {
+	struct bpf_udp_iter_state *iter = seq->private;
 	struct bpf_iter_meta meta;
 	struct bpf_prog *prog;
 
@@ -3192,12 +3348,15 @@ static void bpf_iter_udp_seq_stop(struct seq_file *seq, void *v)
 			(void)udp_prog_seq_show(prog, &meta, v, 0, 0);
 	}
 
-	udp_seq_stop(seq, v);
+	if (iter->cur_sk < iter->end_sk) {
+		bpf_iter_udp_put_batch(iter);
+		iter->st_bucket_done = false;
+	}
 }
 
 static const struct seq_operations bpf_iter_udp_seq_ops = {
-	.start		= udp_seq_start,
-	.next		= udp_seq_next,
+	.start		= bpf_iter_udp_seq_start,
+	.next		= bpf_iter_udp_seq_next,
 	.stop		= bpf_iter_udp_seq_stop,
 	.show		= bpf_iter_udp_seq_show,
 };
@@ -3426,21 +3585,55 @@ static struct pernet_operations __net_initdata udp_sysctl_ops = {
 DEFINE_BPF_ITER_FUNC(udp, struct bpf_iter_meta *meta,
 		     struct udp_sock *udp_sk, uid_t uid, int bucket)
 
+static int bpf_iter_udp_realloc_batch(struct bpf_udp_iter_state *iter,
+				      unsigned int new_batch_sz)
+{
+	struct sock **new_batch;
+
+	new_batch = kvmalloc_array(new_batch_sz, sizeof(*new_batch),
+				   GFP_USER | __GFP_NOWARN);
+	if (!new_batch)
+		return -ENOMEM;
+
+	bpf_iter_udp_put_batch(iter);
+	kvfree(iter->batch);
+	iter->batch = new_batch;
+	iter->max_sk = new_batch_sz;
+
+	return 0;
+}
+
+#define INIT_BATCH_SZ 16
+
 static int bpf_iter_init_udp(void *priv_data, struct bpf_iter_aux_info *aux)
 {
-	return bpf_iter_init_seq_net(priv_data, aux);
+	struct bpf_udp_iter_state *iter = priv_data;
+	int ret;
+
+	ret = bpf_iter_init_seq_net(priv_data, aux);
+	if (ret)
+		return ret;
+
+	ret = bpf_iter_udp_realloc_batch(iter, INIT_BATCH_SZ);
+	if (ret)
+		bpf_iter_fini_seq_net(priv_data);
+
+	return ret;
 }
 
 static void bpf_iter_fini_udp(void *priv_data)
 {
+	struct bpf_udp_iter_state *iter = priv_data;
+
 	bpf_iter_fini_seq_net(priv_data);
+	kvfree(iter->batch);
 }
 
 static const struct bpf_iter_seq_info udp_seq_info = {
 	.seq_ops = &bpf_iter_udp_seq_ops,
 	.init_seq_private = bpf_iter_init_udp,
 	.fini_seq_private = bpf_iter_fini_udp,
-	.seq_priv_size = sizeof(struct udp_iter_state),
+	.seq_priv_size = sizeof(struct bpf_udp_iter_state),
 };
 
 static struct bpf_iter_reg udp_reg_info = {

From patchwork Wed May 17 17:57:41 2023
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13245487
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com
Subject: [PATCH v8 bpf-next 06/10] bpf: Add bpf_sock_destroy kfunc
Date: Wed, 17 May 2023 17:57:41 +0000
Message-Id: <20230517175741.528212-1-aditi.ghag@isovalent.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: bpf@vger.kernel.org

The socket destroy kfunc is used to forcefully terminate sockets from
certain BPF contexts. We plan to use the capability in Cilium
load-balancing to terminate client sockets that continue to connect to
deleted backends. The other use case is on-the-fly policy enforcement,
where existing socket connections prevented by policies need to be
forcefully terminated. The kfunc also allows terminating sockets that
may or may not be actively sending traffic.

The kfunc is currently exposed to BPF TCP and UDP iterators, where users
can filter and terminate selected sockets. Additionally, it can only be
called from BPF contexts that ensure socket locking, in order to allow
synchronous execution of protocol specific diag_destroy handlers. The
previous commit that batches UDP sockets during iteration facilitated a
synchronous invocation of the UDP destroy callback from BPF context by
skipping socket locks in `udp_abort`. The TCP iterator already supported
batching of sockets being iterated.
Follow-up commits will ensure that the kfunc can only be called from
such programs with the `BPF_TRACE_ITER` attach type.

The kfunc takes a `sock_common` type argument, even though it expects a
`sock` and casts the argument to a `sock` pointer. This enables the
verifier to allow the sock_destroy kfunc to be called for TCP with
`sock_common` and UDP with `sock` structs. Furthermore, as `sock_common`
only has a subset of the fields of `sock`, casting a pointer to the
latter type might not always be safe for certain sockets like request
sockets, but these have special handling in the diag_destroy handlers.

Additionally, the kfunc is defined with the `KF_TRUSTED_ARGS` flag to
avoid the cases where a `PTR_TO_BTF_ID` sk is obtained by following
another pointer, e.g. getting a sk pointer (possibly even NULL) by
following another sk pointer. The socket pointer argument passed in the
TCP and UDP iterators is tagged as `PTR_TRUSTED` in {tcp,udp}_reg_info.
The TRUSTED arg changes are contributed by Martin KaFai Lau.

Signed-off-by: Aditi Ghag
---
 net/core/filter.c   | 54 +++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp.c      |  9 +++++---
 net/ipv4/tcp_ipv4.c |  2 +-
 net/ipv4/udp.c      |  8 ++++---
 4 files changed, 66 insertions(+), 7 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 727c5269867d..0be10f6556df 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -11715,3 +11715,57 @@ static int __init bpf_kfunc_init(void)
 	return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp);
 }
 late_initcall(bpf_kfunc_init);
+
+/* Disables missing prototype warnings */
+__diag_push();
+__diag_ignore_all("-Wmissing-prototypes",
+		  "Global functions as their definitions will be in vmlinux BTF");
+
+/* bpf_sock_destroy: Destroy the given socket with ECONNABORTED error code.
+ *
+ * The function expects a non-NULL pointer to a socket, and invokes the
+ * protocol specific socket destroy handlers.
+ *
+ * The helper can only be called from BPF contexts that have acquired the socket
+ * locks.
+ *
+ * Parameters:
+ * @sock: Pointer to socket to be destroyed
+ *
+ * Return:
+ * On error, may return EOPNOTSUPP, EINVAL.
+ * EOPNOTSUPP if protocol specific destroy handler is not supported.
+ * 0 otherwise
+ */
+__bpf_kfunc int bpf_sock_destroy(struct sock_common *sock)
+{
+	struct sock *sk = (struct sock *)sock;
+
+	/* The locking semantics that allow for synchronous execution of the
+	 * destroy handlers are only supported for TCP and UDP.
+	 * Supporting protocols will need to acquire sock lock in the BPF context
+	 * prior to invoking this kfunc.
+	 */
+	if (!sk->sk_prot->diag_destroy || (sk->sk_protocol != IPPROTO_TCP &&
+					   sk->sk_protocol != IPPROTO_UDP))
+		return -EOPNOTSUPP;
+
+	return sk->sk_prot->diag_destroy(sk, ECONNABORTED);
+}
+
+__diag_pop()
+
+BTF_SET8_START(bpf_sk_iter_check_kfunc_set)
+BTF_ID_FLAGS(func, bpf_sock_destroy, KF_TRUSTED_ARGS)
+BTF_SET8_END(bpf_sk_iter_check_kfunc_set)
+
+static const struct btf_kfunc_id_set bpf_sk_iter_kfunc_set = {
+	.owner = THIS_MODULE,
+	.set = &bpf_sk_iter_check_kfunc_set,
+};
+
+static int init_subsystem(void)
+{
+	return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_sk_iter_kfunc_set);
+}
+late_initcall(init_subsystem);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 288693981b00..fd41fdc09211 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4679,8 +4679,10 @@ int tcp_abort(struct sock *sk, int err)
 		return 0;
 	}
 
-	/* Don't race with userspace socket closes such as tcp_close. */
-	lock_sock(sk);
+	/* BPF context ensures sock locking. */
+	if (!has_current_bpf_ctx())
+		/* Don't race with userspace socket closes such as tcp_close. */
+		lock_sock(sk);
 
 	if (sk->sk_state == TCP_LISTEN) {
 		tcp_set_state(sk, TCP_CLOSE);
@@ -4704,7 +4706,8 @@ int tcp_abort(struct sock *sk, int err)
 	bh_unlock_sock(sk);
 	local_bh_enable();
 	tcp_write_queue_purge(sk);
-	release_sock(sk);
+	if (!has_current_bpf_ctx())
+		release_sock(sk);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(tcp_abort);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f2d370a9450f..af75ddcbee62 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -3354,7 +3354,7 @@ static struct bpf_iter_reg tcp_reg_info = {
 	.ctx_arg_info_size = 1,
 	.ctx_arg_info = {
 		{ offsetof(struct bpf_iter__tcp, sk_common),
-		  PTR_TO_BTF_ID_OR_NULL },
+		  PTR_TO_BTF_ID_OR_NULL | PTR_TRUSTED},
 	},
 	.get_func_proto = bpf_iter_tcp_get_func_proto,
 	.seq_info = &tcp_seq_info,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 8fe2fd6255cc..289fbbec633e 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2925,7 +2925,8 @@ EXPORT_SYMBOL(udp_poll);
 
 int udp_abort(struct sock *sk, int err)
 {
-	lock_sock(sk);
+	if (!has_current_bpf_ctx())
+		lock_sock(sk);
 
 	/* udp{v6}_destroy_sock() sets it under the sk lock, avoid racing
 	 * with close()
@@ -2938,7 +2939,8 @@ int udp_abort(struct sock *sk, int err)
 	__udp_disconnect(sk, 0);
 
 out:
-	release_sock(sk);
+	if (!has_current_bpf_ctx())
+		release_sock(sk);
 
 	return 0;
 }
@@ -3641,7 +3643,7 @@ static struct bpf_iter_reg udp_reg_info = {
 	.ctx_arg_info_size = 1,
 	.ctx_arg_info = {
 		{ offsetof(struct bpf_iter__udp, udp_sk),
-		  PTR_TO_BTF_ID_OR_NULL },
+		  PTR_TO_BTF_ID_OR_NULL | PTR_TRUSTED },
 	},
 	.seq_info = &udp_seq_info,
 };

From patchwork Wed May 17 17:57:54 2023
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13245488
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com
Subject: [PATCH v8 bpf-next 07/10] selftests/bpf: Add helper to get port using getsockname
Date: Wed, 17 May 2023 17:57:54 +0000
Message-Id: <20230517175754.528242-1-aditi.ghag@isovalent.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: bpf@vger.kernel.org

The helper will be used to programmatically retrieve and pass ports in
userspace and kernel selftest programs.
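
As a rough illustration of the approach described above, a port lookup built on `getsockname()` might look like the sketch below. It is not the helper added by this patch: the names `local_port_host_order` and `bound_udp_socket` are hypothetical, and unlike the selftest helper, which hands back the port exactly as stored in the socket address (network byte order), this sketch converts it with `ntohs()` so that it can be printed or compared with host-order values.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch: returns the local port of
 * sock_fd in host byte order, or -1 on error or unsupported family.
 */
static int local_port_host_order(int sock_fd)
{
	struct sockaddr_storage addr;
	socklen_t addrlen = sizeof(addr);

	if (getsockname(sock_fd, (struct sockaddr *)&addr, &addrlen) < 0)
		return -1;
	/* sockaddr_storage is large enough for any address family. */
	assert(addrlen <= sizeof(addr));

	if (addr.ss_family == AF_INET)
		return ntohs(((struct sockaddr_in *)&addr)->sin_port);
	if (addr.ss_family == AF_INET6)
		return ntohs(((struct sockaddr_in6 *)&addr)->sin6_port);

	return -1;
}

/* Hypothetical helper: UDP socket bound to an ephemeral port on the
 * IPv4 loopback address. Returns the fd, or -1 on error.
 */
static int bound_udp_socket(void)
{
	struct sockaddr_in sin = { .sin_family = AF_INET };
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;

	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0; /* let the kernel pick a free port */
	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would typically bind with port 0, call the lookup once, and pass the result to whatever needs to match on the port. When the value is compared against `inet_sport` inside a BPF program, as the selftests in this series do, the network-order value is the one to keep.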

Suggested-by: Stanislav Fomichev
Signed-off-by: Aditi Ghag
---
 tools/testing/selftests/bpf/network_helpers.c | 23 +++++++++++++++++++
 tools/testing/selftests/bpf/network_helpers.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
index 596caa176582..a105c0cd008a 100644
--- a/tools/testing/selftests/bpf/network_helpers.c
+++ b/tools/testing/selftests/bpf/network_helpers.c
@@ -427,3 +427,26 @@ void close_netns(struct nstoken *token)
 	close(token->orig_netns_fd);
 	free(token);
 }
+
+int get_socket_local_port(int sock_fd)
+{
+	struct sockaddr_storage addr;
+	socklen_t addrlen = sizeof(addr);
+	int err;
+
+	err = getsockname(sock_fd, (struct sockaddr *)&addr, &addrlen);
+	if (err < 0)
+		return err;
+
+	if (addr.ss_family == AF_INET) {
+		struct sockaddr_in *sin = (struct sockaddr_in *)&addr;
+
+		return sin->sin_port;
+	} else if (addr.ss_family == AF_INET6) {
+		struct sockaddr_in6 *sin = (struct sockaddr_in6 *)&addr;
+
+		return sin->sin6_port;
+	}
+
+	return -1;
+}
diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h
index f882c691b790..694185644da6 100644
--- a/tools/testing/selftests/bpf/network_helpers.h
+++ b/tools/testing/selftests/bpf/network_helpers.h
@@ -56,6 +56,7 @@ int fastopen_connect(int server_fd, const char *data, unsigned int data_len,
 int make_sockaddr(int family, const char *addr_str, __u16 port,
 		  struct sockaddr_storage *addr, socklen_t *len);
 char *ping_command(int family);
+int get_socket_local_port(int sock_fd);
 
 struct nstoken;
 /**

From patchwork Wed May 17 17:58:16 2023
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13245489
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com
Subject: [PATCH v8 bpf-next 08/10] selftests/bpf: Test bpf_sock_destroy
Date: Wed, 17 May 2023 17:58:16 +0000
Message-Id: <20230517175816.528276-1-aditi.ghag@isovalent.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: bpf@vger.kernel.org

The test cases for destroying sockets mirror the intended usages of the
bpf_sock_destroy kfunc using iterators.

The destroy helpers set `ECONNABORTED` error code that we can validate
in the test code with client sockets. But UDP sockets have an overriding
error code from `disconnect()` called during abort, so the error code
validation is only done for TCP sockets.
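
The validation step described above boils down to sending on a socket and comparing `errno` against the expected value. A minimal userspace sketch of that pattern follows. The name `send_fails_with` is hypothetical, and because destroying a socket requires the BPF iterator from this series, the sketch can only be exercised outside the kernel tree with a stand-in, for example an `AF_UNIX` stream socket whose peer has been closed, which reports `EPIPE` rather than `ECONNABORTED`.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper, not part of the patch: returns 1 if a one-byte
 * send() on fd fails with expected_errno, 0 otherwise.
 */
static int send_fails_with(int fd, int expected_errno)
{
	ssize_t n;

	assert(expected_errno > 0);
	errno = 0;
	/* MSG_NOSIGNAL: report a broken stream through errno instead of
	 * raising SIGPIPE in the calling process.
	 */
	n = send(fd, "t", 1, MSG_NOSIGNAL);

	return n < 0 && errno == expected_errno;
}
```

In the selftest itself, the equivalent check is the `ASSERT_LT(n, 0, ...)` on the second `send()` followed by `ASSERT_EQ(errno, ECONNABORTED, ...)` on the destroyed TCP client socket.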
Signed-off-by: Aditi Ghag --- .../selftests/bpf/prog_tests/sock_destroy.c | 219 ++++++++++++++++++ .../selftests/bpf/progs/sock_destroy_prog.c | 145 ++++++++++++ 2 files changed, 364 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_destroy.c create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_prog.c diff --git a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c new file mode 100644 index 000000000000..56b72594cd6b --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c @@ -0,0 +1,219 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include + +#include "sock_destroy_prog.skel.h" +#include "network_helpers.h" + +#define TEST_NS "sock_destroy_netns" + +static void start_iter_sockets(struct bpf_program *prog) +{ + struct bpf_link *link; + char buf[50] = {}; + int iter_fd, len; + + link = bpf_program__attach_iter(prog, NULL); + if (!ASSERT_OK_PTR(link, "attach_iter")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (!ASSERT_GE(iter_fd, 0, "create_iter")) + goto free_link; + + while ((len = read(iter_fd, buf, sizeof(buf))) > 0) + ; + ASSERT_GE(len, 0, "read"); + + close(iter_fd); + +free_link: + bpf_link__destroy(link); +} + +static void test_tcp_client(struct sock_destroy_prog *skel) +{ + int serv = -1, clien = -1, accept_serv = -1, n; + + serv = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0); + if (!ASSERT_GE(serv, 0, "start_server")) + goto cleanup; + + clien = connect_to_fd(serv, 0); + if (!ASSERT_GE(clien, 0, "connect_to_fd")) + goto cleanup; + + accept_serv = accept(serv, NULL, NULL); + if (!ASSERT_GE(accept_serv, 0, "serv accept")) + goto cleanup; + + n = send(clien, "t", 1, 0); + if (!ASSERT_EQ(n, 1, "client send")) + goto cleanup; + + /* Run iterator program that destroys connected client sockets. 
*/ + start_iter_sockets(skel->progs.iter_tcp6_client); + + n = send(clien, "t", 1, 0); + if (!ASSERT_LT(n, 0, "client_send on destroyed socket")) + goto cleanup; + ASSERT_EQ(errno, ECONNABORTED, "error code on destroyed socket"); + +cleanup: + if (clien != -1) + close(clien); + if (accept_serv != -1) + close(accept_serv); + if (serv != -1) + close(serv); +} + +static void test_tcp_server(struct sock_destroy_prog *skel) +{ + int serv = -1, clien = -1, accept_serv = -1, n, serv_port; + + serv = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0); + if (!ASSERT_GE(serv, 0, "start_server")) + goto cleanup; + serv_port = get_socket_local_port(serv); + if (!ASSERT_GE(serv_port, 0, "get_sock_local_port")) + goto cleanup; + skel->bss->serv_port = (__be16) serv_port; + + clien = connect_to_fd(serv, 0); + if (!ASSERT_GE(clien, 0, "connect_to_fd")) + goto cleanup; + + accept_serv = accept(serv, NULL, NULL); + if (!ASSERT_GE(accept_serv, 0, "serv accept")) + goto cleanup; + + n = send(clien, "t", 1, 0); + if (!ASSERT_EQ(n, 1, "client send")) + goto cleanup; + + /* Run iterator program that destroys server sockets. */ + start_iter_sockets(skel->progs.iter_tcp6_server); + + n = send(clien, "t", 1, 0); + if (!ASSERT_LT(n, 0, "client_send on destroyed socket")) + goto cleanup; + ASSERT_EQ(errno, ECONNRESET, "error code on destroyed socket"); + +cleanup: + if (clien != -1) + close(clien); + if (accept_serv != -1) + close(accept_serv); + if (serv != -1) + close(serv); +} + +static void test_udp_client(struct sock_destroy_prog *skel) +{ + int serv = -1, clien = -1, n = 0; + + serv = start_server(AF_INET6, SOCK_DGRAM, NULL, 0, 0); + if (!ASSERT_GE(serv, 0, "start_server")) + goto cleanup; + + clien = connect_to_fd(serv, 0); + if (!ASSERT_GE(clien, 0, "connect_to_fd")) + goto cleanup; + + n = send(clien, "t", 1, 0); + if (!ASSERT_EQ(n, 1, "client send")) + goto cleanup; + + /* Run iterator program that destroys sockets. 
*/ + start_iter_sockets(skel->progs.iter_udp6_client); + + n = send(clien, "t", 1, 0); + if (!ASSERT_LT(n, 0, "client_send on destroyed socket")) + goto cleanup; + /* UDP sockets have an overriding error code after they are disconnected, + * so we don't check for ECONNABORTED error code. + */ + +cleanup: + if (clien != -1) + close(clien); + if (serv != -1) + close(serv); +} + +static void test_udp_server(struct sock_destroy_prog *skel) +{ + int *listen_fds = NULL, n, i, serv_port; + unsigned int num_listens = 5; + char buf[1]; + + /* Start reuseport servers. */ + listen_fds = start_reuseport_server(AF_INET6, SOCK_DGRAM, + "::1", 0, 0, num_listens); + if (!ASSERT_OK_PTR(listen_fds, "start_reuseport_server")) + goto cleanup; + serv_port = get_socket_local_port(listen_fds[0]); + if (!ASSERT_GE(serv_port, 0, "get_sock_local_port")) + goto cleanup; + skel->bss->serv_port = (__be16) serv_port; + + /* Run iterator program that destroys server sockets. */ + start_iter_sockets(skel->progs.iter_udp6_server); + + for (i = 0; i < num_listens; ++i) { + n = read(listen_fds[i], buf, sizeof(buf)); + if (!ASSERT_EQ(n, -1, "read") || + !ASSERT_EQ(errno, ECONNABORTED, "error code on destroyed socket")) + break; + } + ASSERT_EQ(i, num_listens, "server socket"); + +cleanup: + free_fds(listen_fds, num_listens); +} + +void test_sock_destroy(void) +{ + struct sock_destroy_prog *skel; + struct nstoken *nstoken = NULL; + int cgroup_fd; + + skel = sock_destroy_prog__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + return; + + cgroup_fd = test__join_cgroup("/sock_destroy"); + if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup")) + goto cleanup; + + skel->links.sock_connect = bpf_program__attach_cgroup( + skel->progs.sock_connect, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links.sock_connect, "prog_attach")) + goto cleanup; + + SYS(cleanup, "ip netns add %s", TEST_NS); + SYS(cleanup, "ip -net %s link set dev lo up", TEST_NS); + + nstoken = open_netns(TEST_NS); + if (!ASSERT_OK_PTR(nstoken, 
"open_netns")) + goto cleanup; + + if (test__start_subtest("tcp_client")) + test_tcp_client(skel); + if (test__start_subtest("tcp_server")) + test_tcp_server(skel); + if (test__start_subtest("udp_client")) + test_udp_client(skel); + if (test__start_subtest("udp_server")) + test_udp_server(skel); + + +cleanup: + if (nstoken) + close_netns(nstoken); + SYS_NOFAIL("ip netns del " TEST_NS " &> /dev/null"); + if (cgroup_fd >= 0) + close(cgroup_fd); + sock_destroy_prog__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/sock_destroy_prog.c b/tools/testing/selftests/bpf/progs/sock_destroy_prog.c new file mode 100644 index 000000000000..9e0bf7a54cec --- /dev/null +++ b/tools/testing/selftests/bpf/progs/sock_destroy_prog.c @@ -0,0 +1,145 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "vmlinux.h" +#include +#include + +#include "bpf_tracing_net.h" + +__be16 serv_port = 0; + +int bpf_sock_destroy(struct sock_common *sk) __ksym; + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} tcp_conn_sockets SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} udp_conn_sockets SEC(".maps"); + +SEC("cgroup/connect6") +int sock_connect(struct bpf_sock_addr *ctx) +{ + __u64 sock_cookie = 0; + int key = 0; + __u32 keyc = 0; + + if (ctx->family != AF_INET6 || ctx->user_family != AF_INET6) + return 1; + + sock_cookie = bpf_get_socket_cookie(ctx); + if (ctx->protocol == IPPROTO_TCP) + bpf_map_update_elem(&tcp_conn_sockets, &key, &sock_cookie, 0); + else if (ctx->protocol == IPPROTO_UDP) + bpf_map_update_elem(&udp_conn_sockets, &keyc, &sock_cookie, 0); + else + return 1; + + return 1; +} + +SEC("iter/tcp") +int iter_tcp6_client(struct bpf_iter__tcp *ctx) +{ + struct sock_common *sk_common = ctx->sk_common; + __u64 sock_cookie = 0; + __u64 *val; + int key = 0; + + if (!sk_common) + return 0; + + if (sk_common->skc_family != 
AF_INET6) + return 0; + + sock_cookie = bpf_get_socket_cookie(sk_common); + val = bpf_map_lookup_elem(&tcp_conn_sockets, &key); + if (!val) + return 0; + /* Destroy connected client sockets. */ + if (sock_cookie == *val) + bpf_sock_destroy(sk_common); + + return 0; +} + +SEC("iter/tcp") +int iter_tcp6_server(struct bpf_iter__tcp *ctx) +{ + struct sock_common *sk_common = ctx->sk_common; + const struct inet_connection_sock *icsk; + const struct inet_sock *inet; + struct tcp6_sock *tcp_sk; + __be16 srcp; + + if (!sk_common) + return 0; + + if (sk_common->skc_family != AF_INET6) + return 0; + + tcp_sk = bpf_skc_to_tcp6_sock(sk_common); + if (!tcp_sk) + return 0; + + icsk = &tcp_sk->tcp.inet_conn; + inet = &icsk->icsk_inet; + srcp = inet->inet_sport; + + /* Destroy server sockets. */ + if (srcp == serv_port) + bpf_sock_destroy(sk_common); + + return 0; +} + + +SEC("iter/udp") +int iter_udp6_client(struct bpf_iter__udp *ctx) +{ + struct udp_sock *udp_sk = ctx->udp_sk; + struct sock *sk = (struct sock *) udp_sk; + __u64 sock_cookie = 0, *val; + int key = 0; + + if (!sk) + return 0; + + sock_cookie = bpf_get_socket_cookie(sk); + val = bpf_map_lookup_elem(&udp_conn_sockets, &key); + if (!val) + return 0; + /* Destroy connected client sockets. 
*/ + if (sock_cookie == *val) + bpf_sock_destroy((struct sock_common *)sk); + + return 0; +} + +SEC("iter/udp") +int iter_udp6_server(struct bpf_iter__udp *ctx) +{ + struct udp_sock *udp_sk = ctx->udp_sk; + struct sock *sk = (struct sock *) udp_sk; + struct inet_sock *inet; + __be16 srcp; + + if (!sk) + return 0; + + inet = &udp_sk->inet; + srcp = inet->inet_sport; + if (srcp == serv_port) + bpf_sock_destroy((struct sock_common *)sk); + + return 0; +} + +char _license[] SEC("license") = "GPL"; From patchwork Wed May 17 17:59:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13245504 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 40A0310966 for ; Wed, 17 May 2023 17:59:50 +0000 (UTC) Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78F03213B for ; Wed, 17 May 2023 10:59:48 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1ae3a5dfa42so8991135ad.0 for ; Wed, 17 May 2023 10:59:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1684346388; x=1686938388; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=jg0WCZ9Iqn+RY0bfCJQZxl/NIAZaJJtjp8wqjrC3QVo=; b=YvTsUOtVRHiRSOePgj5r0JKiYHscV+VNCoKuAu4ubK0fk6id3spA+kXQshSutR7wcA /F3QFjs//mp7UqyBRrAsHUMtvfXImpIGrddIF0aY9WiHHLggkID6m7M5h9oGVmgbDvra noa5p+oDjGcWZWZjt4lXLyQ6giGpYEi3IRPfW/GQl0QxCSJY/M7J6ZHxaQgiOjiBbbKN uhYYQ6LArNKQaSJh+cy8/jhtXlGFKasX+C6hJobJxmM0uAL4mgbHynLoQxzkuleEbntz fQTA6pQp5NmUrEoWAXO9VS2uSqZjOgO6olMqmtqTuwlGvJJcs67ejipuqZ8lb193gE9R zdqw== 
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, void@manifault.com, aditi.ghag@isovalent.com, Martin KaFai Lau
Subject: [PATCH v8 bpf-next 09/10] bpf: Add kfunc filter function to 'struct btf_kfunc_id_set'
Date: Wed, 17 May 2023 17:59:42 +0000
Message-Id: <20230517175942.528375-1-aditi.ghag@isovalent.com>
X-Mailing-List: bpf@vger.kernel.org

This commit adds the ability to filter kfuncs to certain BPF program
types, and thereby limits the bpf_sock_destroy kfunc to programs with
attach type 'BPF_TRACE_ITER'.

Previous commits introduced the 'bpf_sock_destroy' kfunc, which can only
be called from BPF (socket) iterator programs, because the kfunc requires
lock_sock to have been taken from the BPF context before it is called.

To that end, the patch adds a callback filter to 'struct
btf_kfunc_id_set'. The filter receives the bpf_prog and can inspect its
properties. For the bpf_sock_destroy case, the provided callback filter
checks the `expected_attach_type` of the bpf_prog to decide whether the
program may access the kfunc.

Signed-off-by: Aditi Ghag
Signed-off-by: Martin KaFai Lau
---
 include/linux/btf.h   | 18 ++++++++-----
 kernel/bpf/btf.c      | 59 +++++++++++++++++++++++++++++++++++--------
 kernel/bpf/verifier.c |  7 ++---
 net/core/filter.c     |  9 +++++++
 4 files changed, 73 insertions(+), 20 deletions(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 495250162422..918a0b6379bd 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -99,10 +99,14 @@ struct btf_type;
 union bpf_attr;
 struct btf_show;
 struct btf_id_set;
+struct bpf_prog;
+
+typedef int (*btf_kfunc_filter_t)(const struct bpf_prog *prog, u32 kfunc_id);
 
 struct btf_kfunc_id_set {
 	struct module *owner;
 	struct btf_id_set8 *set;
+	btf_kfunc_filter_t filter;
 };
 
 struct btf_id_dtor_kfunc {
@@ -482,7 +486,6 @@ static inline void *btf_id_set8_contains(const struct btf_id_set8 *set, u32 id)
 	return bsearch(&id, set->pairs, set->cnt, sizeof(set->pairs[0]), btf_id_cmp_func);
 }
 
-struct bpf_prog;
 struct bpf_verifier_log;
 
 #ifdef CONFIG_BPF_SYSCALL
@@ -490,10 +493,10 @@ const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
 const char *btf_name_by_offset(const struct btf *btf, u32 offset);
 struct btf *btf_parse_vmlinux(void);
 struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog);
-u32 *btf_kfunc_id_set_contains(const struct btf *btf,
-			       enum bpf_prog_type prog_type,
-			       u32 kfunc_btf_id);
-u32 *btf_kfunc_is_modify_return(const struct btf *btf, u32 kfunc_btf_id);
+u32 *btf_kfunc_id_set_contains(const struct btf *btf, u32 kfunc_btf_id,
+			       const struct bpf_prog *prog);
+u32 *btf_kfunc_is_modify_return(const struct btf *btf, u32 kfunc_btf_id,
+				const struct bpf_prog *prog);
 int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
 			      const struct btf_kfunc_id_set *s);
 int register_btf_fmodret_id_set(const struct btf_kfunc_id_set *kset);
@@ -520,8 +523,9 @@ static inline const char *btf_name_by_offset(const struct btf *btf,
 	return NULL;
 }
 static inline u32 *btf_kfunc_id_set_contains(const struct btf *btf,
-					     enum bpf_prog_type prog_type,
-					     u32 kfunc_btf_id)
+					     u32 kfunc_btf_id,
+					     struct bpf_prog *prog)
+
 {
 	return NULL;
 }
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 913b9d717a4a..c6dae44e236d 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -218,10 +218,17 @@ enum btf_kfunc_hook {
 enum {
 	BTF_KFUNC_SET_MAX_CNT = 256,
 	BTF_DTOR_KFUNC_MAX_CNT = 256,
+	BTF_KFUNC_FILTER_MAX_CNT = 16,
+};
+
+struct btf_kfunc_hook_filter {
+	btf_kfunc_filter_t filters[BTF_KFUNC_FILTER_MAX_CNT];
+	u32 nr_filters;
 };
 
 struct btf_kfunc_set_tab {
 	struct btf_id_set8 *sets[BTF_KFUNC_HOOK_MAX];
+	struct btf_kfunc_hook_filter hook_filters[BTF_KFUNC_HOOK_MAX];
 };
 
 struct btf_id_dtor_kfunc_tab {
@@ -7720,9 +7727,12 @@ static int btf_check_kfunc_protos(struct btf *btf, u32 func_id, u32 func_flags)
 /* Kernel Function (kfunc) BTF ID set registration API */
 
 static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook,
-				  struct btf_id_set8 *add_set)
+				  const struct btf_kfunc_id_set *kset)
 {
+	struct btf_kfunc_hook_filter *hook_filter;
+	struct btf_id_set8 *add_set = kset->set;
 	bool vmlinux_set = !btf_is_module(btf);
+	bool add_filter = !!kset->filter;
 	struct btf_kfunc_set_tab *tab;
 	struct btf_id_set8 *set;
 	u32 set_cnt;
@@ -7737,6 +7747,20 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook,
 		return 0;
 
 	tab = btf->kfunc_set_tab;
+
+	if (tab && add_filter) {
+		int i;
+
+		hook_filter = &tab->hook_filters[hook];
+		for (i = 0; i < hook_filter->nr_filters; i++) {
+			if (hook_filter->filters[i] == kset->filter)
+				add_filter = false;
+		}
+
+		if (add_filter && hook_filter->nr_filters == BTF_KFUNC_FILTER_MAX_CNT)
+			return -E2BIG;
+	}
+
 	if (!tab) {
 		tab = kzalloc(sizeof(*tab), GFP_KERNEL | __GFP_NOWARN);
 		if (!tab)
@@ -7759,7 +7783,7 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook,
 	 */
 	if (!vmlinux_set) {
 		tab->sets[hook] = add_set;
-		return 0;
+		goto do_add_filter;
 	}
 
 	/* In case of vmlinux sets, there may be more than one set being
@@ -7801,6 +7825,11 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook,
 
 	sort(set->pairs, set->cnt, sizeof(set->pairs[0]), btf_id_cmp_func, NULL);
 
+do_add_filter:
+	if (add_filter) {
+		hook_filter = &tab->hook_filters[hook];
+		hook_filter->filters[hook_filter->nr_filters++] = kset->filter;
+	}
 	return 0;
 end:
 	btf_free_kfunc_set_tab(btf);
@@ -7809,15 +7838,22 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook,
 
 static u32 *__btf_kfunc_id_set_contains(const struct btf *btf,
 					enum btf_kfunc_hook hook,
+					const struct bpf_prog *prog,
 					u32 kfunc_btf_id)
 {
+	struct btf_kfunc_hook_filter *hook_filter;
 	struct btf_id_set8 *set;
-	u32 *id;
+	u32 *id, i;
 
 	if (hook >= BTF_KFUNC_HOOK_MAX)
 		return NULL;
 	if (!btf->kfunc_set_tab)
 		return NULL;
+	hook_filter = &btf->kfunc_set_tab->hook_filters[hook];
+	for (i = 0; i < hook_filter->nr_filters; i++) {
+		if (hook_filter->filters[i](prog, kfunc_btf_id))
+			return NULL;
+	}
 	set = btf->kfunc_set_tab->sets[hook];
 	if (!set)
 		return NULL;
@@ -7870,23 +7906,25 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
  * protection for looking up a well-formed btf->kfunc_set_tab.
  */
 u32 *btf_kfunc_id_set_contains(const struct btf *btf,
-			       enum bpf_prog_type prog_type,
-			       u32 kfunc_btf_id)
+			       u32 kfunc_btf_id,
+			       const struct bpf_prog *prog)
 {
+	enum bpf_prog_type prog_type = resolve_prog_type(prog);
 	enum btf_kfunc_hook hook;
 	u32 *kfunc_flags;
 
-	kfunc_flags = __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_COMMON, kfunc_btf_id);
+	kfunc_flags = __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_COMMON, prog, kfunc_btf_id);
 	if (kfunc_flags)
 		return kfunc_flags;
 
 	hook = bpf_prog_type_to_kfunc_hook(prog_type);
-	return __btf_kfunc_id_set_contains(btf, hook, kfunc_btf_id);
+	return __btf_kfunc_id_set_contains(btf, hook, prog, kfunc_btf_id);
 }
 
-u32 *btf_kfunc_is_modify_return(const struct btf *btf, u32 kfunc_btf_id)
+u32 *btf_kfunc_is_modify_return(const struct btf *btf, u32 kfunc_btf_id,
+				const struct bpf_prog *prog)
 {
-	return __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_FMODRET, kfunc_btf_id);
+	return __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_FMODRET, prog, kfunc_btf_id);
 }
 
 static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook,
@@ -7917,7 +7955,8 @@ static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook,
 		goto err_out;
 	}
 
-	ret = btf_populate_kfunc_set(btf, hook, kset->set);
+	ret = btf_populate_kfunc_set(btf, hook, kset);
+
 err_out:
 	btf_put(btf);
 	return ret;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d6db6de3e9ea..8d9519210935 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10534,7 +10534,7 @@ static int fetch_kfunc_meta(struct bpf_verifier_env *env,
 		*kfunc_name = func_name;
 	func_proto = btf_type_by_id(desc_btf, func->type);
 
-	kfunc_flags = btf_kfunc_id_set_contains(desc_btf, resolve_prog_type(env->prog), func_id);
+	kfunc_flags = btf_kfunc_id_set_contains(desc_btf, func_id, env->prog);
 	if (!kfunc_flags) {
 		return -EACCES;
 	}
@@ -18526,7 +18526,8 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
 				 * in the fmodret id set with the KF_SLEEPABLE flag.
 				 */
 				else {
-					u32 *flags = btf_kfunc_is_modify_return(btf, btf_id);
+					u32 *flags = btf_kfunc_is_modify_return(btf, btf_id,
+										prog);
 
 					if (flags && (*flags & KF_SLEEPABLE))
 						ret = 0;
@@ -18554,7 +18555,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
 				return -EINVAL;
 			}
 			ret = -EINVAL;
-			if (btf_kfunc_is_modify_return(btf, btf_id) ||
+			if (btf_kfunc_is_modify_return(btf, btf_id, prog) ||
 			    !check_attach_modify_return(addr, tname))
 				ret = 0;
 			if (ret) {
diff --git a/net/core/filter.c b/net/core/filter.c
index 0be10f6556df..efeac5d8f19c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -11759,9 +11759,18 @@ BTF_SET8_START(bpf_sk_iter_check_kfunc_set)
 BTF_ID_FLAGS(func, bpf_sock_destroy, KF_TRUSTED_ARGS)
 BTF_SET8_END(bpf_sk_iter_check_kfunc_set)
 
+static int tracing_iter_filter(const struct bpf_prog *prog, u32 kfunc_id)
+{
+	if (btf_id_set8_contains(&bpf_sk_iter_check_kfunc_set, kfunc_id) &&
+	    prog->expected_attach_type != BPF_TRACE_ITER)
+		return -EACCES;
+	return 0;
+}
+
 static const struct btf_kfunc_id_set bpf_sk_iter_kfunc_set = {
 	.owner = THIS_MODULE,
 	.set = &bpf_sk_iter_check_kfunc_set,
+	.filter = tracing_iter_filter,
 };
 
 static int init_subsystem(void)

From patchwork Wed May 17 18:00:03 2023
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13245505
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau
Subject: [PATCH v8 bpf-next 10/10] selftests/bpf: Extend bpf_sock_destroy tests
Date: Wed, 17 May 2023 18:00:03 +0000
Message-Id: <20230517180003.528401-1-aditi.ghag@isovalent.com>
X-Mailing-List: bpf@vger.kernel.org

This commit adds a test case to verify that the `bpf_sock_destroy` kfunc
is not allowed from program attach types other than BPF trace iterator.
Unsupported programs calling the kfunc will be rejected by the verifier.

Signed-off-by: Aditi Ghag
Signed-off-by: Martin KaFai Lau
---
 .../selftests/bpf/prog_tests/sock_destroy.c   |  2 ++
 .../bpf/progs/sock_destroy_prog_fail.c        | 22 +++++++++++++++++++
 2 files changed, 24 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c

diff --git a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
index 56b72594cd6b..b0583309a94e 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
@@ -3,6 +3,7 @@
 #include
 
 #include "sock_destroy_prog.skel.h"
+#include "sock_destroy_prog_fail.skel.h"
 #include "network_helpers.h"
 
 #define TEST_NS "sock_destroy_netns"
@@ -208,6 +209,7 @@ void test_sock_destroy(void)
 	if (test__start_subtest("udp_server"))
 		test_udp_server(skel);
 
+	RUN_TESTS(sock_destroy_prog_fail);
 
 cleanup:
 	if (nstoken)
diff --git a/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
new file mode 100644
index 000000000000..dd6850b58e25
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include
+#include
+
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+int bpf_sock_destroy(struct sock_common *sk) __ksym;
+
+SEC("tp_btf/tcp_destroy_sock")
+__failure __msg("calling kernel function bpf_sock_destroy is not allowed")
+int BPF_PROG(trace_tcp_destroy_sock, struct sock *sk)
+{
+	/* should not load */
+	bpf_sock_destroy((struct sock_common *)sk);
+
+	return 0;
+}
+
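
Reading aid for 09/10 and 10/10 (not part of the series): the filter
bookkeeping in btf_populate_kfunc_set() and the lookup in
__btf_kfunc_id_set_contains() can be modelled in plain userspace C. The
sketch below is only an illustration under stated assumptions. The names
`struct prog`, `enum attach_type`, `hook_add_filter`, `hook_allows`,
`iter_only_filter` and the ID value 42 are hypothetical stand-ins, not
kernel definitions. It collapses the patch's check-then-append steps into
one function and, unlike the v8 loop, returns as soon as it finds a
duplicate.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for enum bpf_attach_type and struct bpf_prog. */
enum attach_type { ATTACH_OTHER = 0, ATTACH_TRACE_ITER = 1 };

struct prog {
	enum attach_type expected_attach_type;
};

/* Same shape as btf_kfunc_filter_t: non-zero means "hide this kfunc". */
typedef int (*kfunc_filter_t)(const struct prog *prog, uint32_t kfunc_id);

#define FILTER_MAX_CNT 16	/* mirrors BTF_KFUNC_FILTER_MAX_CNT */

struct hook_filter {
	kfunc_filter_t filters[FILTER_MAX_CNT];
	uint32_t nr_filters;
};

/* Register a filter for a hook. A NULL filter or one that is already
 * present is a no-op; a full table yields -E2BIG.
 */
static int hook_add_filter(struct hook_filter *hf, kfunc_filter_t filter)
{
	uint32_t i;

	if (!filter)
		return 0;
	for (i = 0; i < hf->nr_filters; i++) {
		if (hf->filters[i] == filter)
			return 0;
	}
	if (hf->nr_filters == FILTER_MAX_CNT)
		return -E2BIG;
	hf->filters[hf->nr_filters++] = filter;
	return 0;
}

/* Returns 1 if no registered filter rejects (prog, kfunc_id), else 0.
 * In the patch, a rejection makes the ID-set lookup return NULL.
 */
static int hook_allows(const struct hook_filter *hf, const struct prog *prog,
		       uint32_t kfunc_id)
{
	uint32_t i;

	for (i = 0; i < hf->nr_filters; i++) {
		if (hf->filters[i](prog, kfunc_id))
			return 0;
	}
	return 1;
}

#define SOCK_DESTROY_ID 42u	/* made-up BTF ID, for illustration only */

/* Modelled on tracing_iter_filter(): reject the guarded kfunc unless the
 * program expects the iterator attach type; ignore every other kfunc.
 */
static int iter_only_filter(const struct prog *prog, uint32_t kfunc_id)
{
	if (kfunc_id == SOCK_DESTROY_ID &&
	    prog->expected_attach_type != ATTACH_TRACE_ITER)
		return -EACCES;
	return 0;
}
```

With `iter_only_filter` registered on a zero-initialised `struct
hook_filter`, `hook_allows()` returns 1 for a program whose
`expected_attach_type` is `ATTACH_TRACE_ITER` and 0 for any other program
asking for `SOCK_DESTROY_ID`. That is the behaviour the 10/10 selftest
expects the verifier to enforce for a tp_btf program. Registering the same
filter twice leaves `nr_filters` at 1.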