[v6,bpf-next,0/7] bpf: Add socket destroy capability

Message ID 20230418153148.2231644-1-aditi.ghag@isovalent.com (mailing list archive)

Aditi Ghag April 18, 2023, 3:31 p.m. UTC
This patch series adds the capability to destroy sockets in BPF. We plan to
use the capability in Cilium to force client sockets to reconnect when their
remote load-balancing backends are deleted. The other use case is
on-the-fly policy enforcement, where existing socket connections that
policies prohibit need to be terminated.

The use cases and more details about the selected approach were presented
at LPC 2022:
https://lpc.events/event/16/contributions/1358/
RFC discussion:
https://lore.kernel.org/netdev/CABG=zsBEh-P4NXk23eBJw7eajB5YJeRS7oPXnTAzs=yob4EMoQ@mail.gmail.com/T/#u
v5 patch series:
https://lore.kernel.org/bpf/20230330151758.531170-1-aditi.ghag@isovalent.com/

v6 highlights:
Address review comments:
  Martin:
  - Simplified UDP batching iterator logic.
  - Run tests in a separate netns.
  Stan:
  - Make destroy handler checks opt-in.
  - Extended network helper to get socket port for v4.
  kernel test robot:
  - Updated seq_sk_match and seq_file_family with the necessary ifdefs.
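
For readers unfamiliar with the getsockname-based port lookup mentioned above, here is a minimal userspace sketch. It is not the helper added to network_helpers.c in this series; the function name `local_port_v4` and the choice to return the port in host byte order are assumptions made for illustration only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and return convention are assumptions):
 * returns the local port of an AF_INET socket in host byte order,
 * or -1 if getsockname() fails or the socket is not IPv4.
 */
static int local_port_v4(int fd)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);

	memset(&addr, 0, sizeof(addr));
	if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
		return -1;
	if (addr.sin_family != AF_INET)
		return -1;

	/* sin_port is stored in network byte order */
	return ntohs(addr.sin_port);
}
```

A test would typically bind a socket to port 0, let the kernel pick a free port, and then call a helper like this to learn which port to match against in the BPF program.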

(The notes below are carried over from the v5 patch series and are still
relevant. Refer to earlier patch series for other notes.)
- I hit a snag while writing the kfunc where the verifier complained about the
  `sock_common` type passed from the TCP iterator. With kfuncs, there don't
  seem to be any options available to pass BTF type hints to the verifier
  (the equivalent of `ARG_PTR_TO_BTF_ID_SOCK_COMMON`, as was the case with the
  helper). As a result, I changed the argument type of the sock_destroy
  kfunc to `sock_common`.
- Martin's patch restricts the kfunc to BPF iterators only:
  https://lore.kernel.org/bpf/20230404060959.2259448-1-martin.lau@linux.dev/
  I applied and tested the patch locally.


Aditi Ghag (7):
  bpf: tcp: Avoid taking fast sock lock in iterator
  udp: seq_file: Remove bpf_seq_afinfo from udp_iter_state
  udp: seq_file: Helper function to match socket attributes
  bpf: udp: Implement batching for sockets iterator
  bpf: Add bpf_sock_destroy kfunc
  selftests/bpf: Add helper to get port using getsockname
  selftests/bpf: Test bpf_sock_destroy

 include/net/udp.h                             |   1 -
 net/core/filter.c                             |  57 ++++
 net/ipv4/tcp.c                                |  10 +-
 net/ipv4/tcp_ipv4.c                           |   5 +-
 net/ipv4/udp.c                                | 269 +++++++++++++++---
 tools/testing/selftests/bpf/network_helpers.c |  28 ++
 tools/testing/selftests/bpf/network_helpers.h |   1 +
 .../selftests/bpf/prog_tests/sock_destroy.c   | 217 ++++++++++++++
 .../selftests/bpf/progs/sock_destroy_prog.c   | 147 ++++++++++
 9 files changed, 690 insertions(+), 45 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_destroy.c
 create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_prog.c

Martin KaFai Lau April 24, 2023, 10:15 p.m. UTC | #1
On 4/18/23 8:31 AM, Aditi Ghag wrote:
> This patch adds the capability to destroy sockets in BPF. We plan to use
> the capability in Cilium to force client sockets to reconnect when their
> remote load-balancing backends are deleted. The other use case is
> on-the-fly policy enforcement where existing socket connections prevented
> by policies need to be terminated.

If the earlier kfunc filter patch
(https://lore.kernel.org/bpf/1ECC8AAA-C2E6-4F8A-B7D3-5E90BDEE7C48@isovalent.com/)
looks fine to you, please include it in the next revision. This patchset needs
it. The usual thing to do is to keep my sob (and authorship, if not much has
changed) and add your sob. The test needs to be broken out into a separate
patch, though. It needs to use
'__failure __msg("calling kernel function bpf_sock_destroy is not allowed")'.
There are many examples in selftests, e.g. dynptr_fail.c.

Please also fix the subject in the patches. They are all missing the bpf-next
and revision tag.

Aditi Ghag May 1, 2023, 11:32 p.m. UTC | #2
> On Apr 24, 2023, at 3:15 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
> 
> On 4/18/23 8:31 AM, Aditi Ghag wrote:
>> This patch adds the capability to destroy sockets in BPF. We plan to use
>> the capability in Cilium to force client sockets to reconnect when their
>> remote load-balancing backends are deleted. The other use case is
>> on-the-fly policy enforcement where existing socket connections prevented
>> by policies need to be terminated.
> 
> If the earlier kfunc filter patch (https://lore.kernel.org/bpf/1ECC8AAA-C2E6-4F8A-B7D3-5E90BDEE7C48@isovalent.com/) looks fine to you, please include it into the next revision. This patchset needs it. Usual thing to do is to keep my sob (and author if not much has changed) and add your sob. The test needs to be broken out into a separate patch though. It needs to use the '__failure __msg("calling kernel function bpf_sock_destroy is not allowed")'. There are many examples in selftests, eg. the dynptr_fail.c.
> 

Yeah, ok. I was waiting for your confirmation. The patch doesn't need my sob though (maybe tested-by).
I've created a separate patch for the test. 


> Please also fix the subject in the patches. They are all missing the bpf-next and revision tag.
> 

Took me a few moments to realize that as I was looking at earlier series. Looks like I forgot to add the tags to subsequent patches in this series. I'll fix it up in the next push.

Aditi Ghag May 1, 2023, 11:37 p.m. UTC | #3
> On May 1, 2023, at 4:32 PM, Aditi Ghag <aditi.ghag@isovalent.com> wrote:
> 
> 
> 
>> On Apr 24, 2023, at 3:15 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
>> 
>> On 4/18/23 8:31 AM, Aditi Ghag wrote:
>>> This patch adds the capability to destroy sockets in BPF. We plan to use
>>> the capability in Cilium to force client sockets to reconnect when their
>>> remote load-balancing backends are deleted. The other use case is
>>> on-the-fly policy enforcement where existing socket connections prevented
>>> by policies need to be terminated.
>> 
>> If the earlier kfunc filter patch (https://lore.kernel.org/bpf/1ECC8AAA-C2E6-4F8A-B7D3-5E90BDEE7C48@isovalent.com/) looks fine to you, please include it into the next revision. This patchset needs it. Usual thing to do is to keep my sob (and author if not much has changed) and add your sob. The test needs to be broken out into a separate patch though. It needs to use the '__failure __msg("calling kernel function bpf_sock_destroy is not allowed")'. There are many examples in selftests, eg. the dynptr_fail.c.
>> 
> 
> Yeah, ok. I was waiting for your confirmation. The patch doesn't need my sob though (maybe tested-by).
> I've created a separate patch for the test. 
> 

Ah, looks like the patch is missing a proper description. While I can add something wrt the sock_destroy use case, if you have a blurb, feel free to post it here.

> 
>> Please also fix the subject in the patches. They are all missing the bpf-next and revision tag.
>> 
> 
> Took me a few moments to realize that as I was looking at earlier series. Looks like I forgot to add the tags to subsequent patches in this series. I'll fix it up in the next push.

Aditi Ghag May 2, 2023, 10:52 p.m. UTC | #4
> On May 1, 2023, at 4:32 PM, Aditi Ghag <aditi.ghag@isovalent.com> wrote:
> 
> 
> 
>> On Apr 24, 2023, at 3:15 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
>> 
>> On 4/18/23 8:31 AM, Aditi Ghag wrote:
>>> This patch adds the capability to destroy sockets in BPF. We plan to use
>>> the capability in Cilium to force client sockets to reconnect when their
>>> remote load-balancing backends are deleted. The other use case is
>>> on-the-fly policy enforcement where existing socket connections prevented
>>> by policies need to be terminated.
>> 
>> If the earlier kfunc filter patch (https://lore.kernel.org/bpf/1ECC8AAA-C2E6-4F8A-B7D3-5E90BDEE7C48@isovalent.com/) looks fine to you, please include it into the next revision. This patchset needs it. Usual thing to do is to keep my sob (and author if not much has changed) and add your sob. The test needs to be broken out into a separate patch though. It needs to use the '__failure __msg("calling kernel function bpf_sock_destroy is not allowed")'. There are many examples in selftests, eg. the dynptr_fail.c.
>> 
> 
> Yeah, ok. I was waiting for your confirmation. The patch doesn't need my sob though (maybe tested-by).
> I've created a separate patch for the test. 


Here is the patch diff for the extended test case for your reference. I'm ready to push a new version once I get an ack from you. 

diff --git a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
index a889c53e93c7..afed8cad94ee 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
@@ -3,6 +3,7 @@
 #include <bpf/bpf_endian.h>

 #include "sock_destroy_prog.skel.h"
+#include "sock_destroy_prog_fail.skel.h"
 #include "network_helpers.h"

 #define TEST_NS "sock_destroy_netns"
@@ -207,6 +208,8 @@ void test_sock_destroy(void)
                test_udp_server(skel);


+       RUN_TESTS(sock_destroy_prog_fail);
+
 cleanup:
        if (nstoken)
                close_netns(nstoken);
diff --git a/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
new file mode 100644
index 000000000000..dd6850b58e25
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+int bpf_sock_destroy(struct sock_common *sk) __ksym;
+
+SEC("tp_btf/tcp_destroy_sock")
+__failure __msg("calling kernel function bpf_sock_destroy is not allowed")
+int BPF_PROG(trace_tcp_destroy_sock, struct sock *sk)
+{
+       /* should not load */
+       bpf_sock_destroy((struct sock_common *)sk);
+
+       return 0;
+}

> 
> 
>> Please also fix the subject in the patches. They are all missing the bpf-next and revision tag.
>> 
> 
> Took me a few moments to realize that as I was looking at earlier series. Looks like I forgot to add the tags to subsequent patches in this series. I'll fix it up in the next push.

Martin KaFai Lau May 2, 2023, 11:24 p.m. UTC | #5
On 5/1/23 4:37 PM, Aditi Ghag wrote:
> 
> 
>> On May 1, 2023, at 4:32 PM, Aditi Ghag <aditi.ghag@isovalent.com> wrote:
>>
>>
>>
>>> On Apr 24, 2023, at 3:15 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>
>>> On 4/18/23 8:31 AM, Aditi Ghag wrote:
>>>> This patch adds the capability to destroy sockets in BPF. We plan to use
>>>> the capability in Cilium to force client sockets to reconnect when their
>>>> remote load-balancing backends are deleted. The other use case is
>>>> on-the-fly policy enforcement where existing socket connections prevented
>>>> by policies need to be terminated.
>>>
>>> If the earlier kfunc filter patch (https://lore.kernel.org/bpf/1ECC8AAA-C2E6-4F8A-B7D3-5E90BDEE7C48@isovalent.com/) looks fine to you, please include it into the next revision. This patchset needs it. Usual thing to do is to keep my sob (and author if not much has changed) and add your sob. The test needs to be broken out into a separate patch though. It needs to use the '__failure __msg("calling kernel function bpf_sock_destroy is not allowed")'. There are many examples in selftests, eg. the dynptr_fail.c.
>>>
>>
>> Yeah, ok. I was waiting for your confirmation. The patch doesn't need my sob though (maybe tested-by).
>> I've created a separate patch for the test.

I believe your sob is still needed since you will post the patch.

>>
> 
> Ah, looks like the patch is missing a proper description. While I can add something wrt sock_destroy use case, if you have a blurb, feel free to post it here.

Right, some of the RFC commit message is irrelevant. You can develop the
description based on the useful part of the RFC commit message, like "... added
a callback filter to 'struct btf_kfunc_id_set'. The filter has
access to the prog such that it can filter by other properties of a prog.
The prog->expected_attach_type is used in the tracing_iter_filter() ...". That
is the "how" part. The commit message also needs to explain why the patch is
needed.
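
To make the quoted description concrete, here is a minimal userspace model of a per-set filter callback that rejects a kfunc call based on a property of the calling program. It is only a sketch of the idea, not the kernel implementation: every type, enum, and function name below (`model_prog`, `model_kfunc_id_set`, `model_tracing_iter_filter`, and so on) is a stand-in invented for illustration, and the lookup and error codes are assumptions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for kernel types; all names here are illustrative only. */
enum model_attach_type { MODEL_TRACE_FENTRY, MODEL_TRACE_ITER };

struct model_prog {
	enum model_attach_type expected_attach_type;
};

typedef int (*model_kfunc_filter_t)(const struct model_prog *prog,
				    unsigned int kfunc_id);

struct model_kfunc_id_set {
	const unsigned int *ids;	/* kfunc ids registered in this set */
	size_t cnt;
	model_kfunc_filter_t filter;	/* optional per-set callback */
};

enum { MODEL_KFUNC_SOCK_DESTROY = 1 };

static const unsigned int iter_only_ids[] = { MODEL_KFUNC_SOCK_DESTROY };

static bool id_in(const unsigned int *ids, size_t cnt, unsigned int id)
{
	for (size_t i = 0; i < cnt; i++)
		if (ids[i] == id)
			return true;
	return false;
}

/* Reject iterator-only kfuncs when the program is not an iterator. */
static int model_tracing_iter_filter(const struct model_prog *prog,
				     unsigned int kfunc_id)
{
	if (id_in(iter_only_ids, 1, kfunc_id) &&
	    prog->expected_attach_type != MODEL_TRACE_ITER)
		return -EACCES;
	return 0;
}

/* Returns 0 if the call is allowed, a negative errno otherwise. */
static int model_check_kfunc_call(const struct model_kfunc_id_set *set,
				  const struct model_prog *prog,
				  unsigned int kfunc_id)
{
	if (!id_in(set->ids, set->cnt, kfunc_id))
		return -ENOENT;
	if (set->filter)
		return set->filter(prog, kfunc_id);
	return 0;
}
```

In this model, a program with the iterator attach type passes the check, while any other tracing program calling the same kfunc gets -EACCES, which mirrors the "calling kernel function bpf_sock_destroy is not allowed" failure the new selftest expects.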

> 
>>
>>> Please also fix the subject in the patches. They are all missing the bpf-next and revision tag.
>>>
>>
>> Took me a few moments to realize that as I was looking at earlier series. Looks like I forgot to add the tags to subsequent patches in this series. I'll fix it up in the next push.
>

Martin KaFai Lau May 2, 2023, 11:40 p.m. UTC | #6
On 5/2/23 3:52 PM, Aditi Ghag wrote:
> 
> 
>> On May 1, 2023, at 4:32 PM, Aditi Ghag <aditi.ghag@isovalent.com> wrote:
>>
>>
>>
>>> On Apr 24, 2023, at 3:15 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>
>>> On 4/18/23 8:31 AM, Aditi Ghag wrote:
>>>> This patch adds the capability to destroy sockets in BPF. We plan to use
>>>> the capability in Cilium to force client sockets to reconnect when their
>>>> remote load-balancing backends are deleted. The other use case is
>>>> on-the-fly policy enforcement where existing socket connections prevented
>>>> by policies need to be terminated.
>>>
>>> If the earlier kfunc filter patch (https://lore.kernel.org/bpf/1ECC8AAA-C2E6-4F8A-B7D3-5E90BDEE7C48@isovalent.com/) looks fine to you, please include it into the next revision. This patchset needs it. Usual thing to do is to keep my sob (and author if not much has changed) and add your sob. The test needs to be broken out into a separate patch though. It needs to use the '__failure __msg("calling kernel function bpf_sock_destroy is not allowed")'. There are many examples in selftests, eg. the dynptr_fail.c.
>>>
>>
>> Yeah, ok. I was waiting for your confirmation. The patch doesn't need my sob though (maybe tested-by).
>> I've created a separate patch for the test.
> 
> 
> Here is the patch diff for the extended test case for your reference. I'm ready to push a new version once I get an ack from you.

Looks reasonable to me.

One thing I have been thinking about is that the bpf_sock_destroy kfunc should
probably have KF_TRUSTED_ARGS, but I suspect that may need a change to the
tcp_reg_info in tcp_ipv4.c. Not sure yet. Regardless, I don't think this will
have a major effect on the other patches in this set. Please go ahead and
respin, considering there are a few comments that need to be addressed already.
At worst it can use one final revision to address KF_TRUSTED_ARGS.
[ btw, I also don't see your reply/confirmation on the Patch 1 discussion.
Please ensure those points are also clarified/addressed in the next respin. ]


> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
> index a889c53e93c7..afed8cad94ee 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
> @@ -3,6 +3,7 @@
>   #include <bpf/bpf_endian.h>
> 
>   #include "sock_destroy_prog.skel.h"
> +#include "sock_destroy_prog_fail.skel.h"
>   #include "network_helpers.h"
> 
>   #define TEST_NS "sock_destroy_netns"
> @@ -207,6 +208,8 @@ void test_sock_destroy(void)
>                  test_udp_server(skel);
> 
> 
> +       RUN_TESTS(sock_destroy_prog_fail);
> +
>   cleanup:
>          if (nstoken)
>                  close_netns(nstoken);
> diff --git a/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
> new file mode 100644
> index 000000000000..dd6850b58e25
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
> @@ -0,0 +1,22 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include "vmlinux.h"
> +#include <bpf/bpf_tracing.h>
> +#include <bpf/bpf_helpers.h>
> +
> +#include "bpf_misc.h"
> +
> +char _license[] SEC("license") = "GPL";
> +
> +int bpf_sock_destroy(struct sock_common *sk) __ksym;
> +
> +SEC("tp_btf/tcp_destroy_sock")
> +__failure __msg("calling kernel function bpf_sock_destroy is not allowed")
> +int BPF_PROG(trace_tcp_destroy_sock, struct sock *sk)
> +{
> +       /* should not load */
> +       bpf_sock_destroy((struct sock_common *)sk);
> +
> +       return 0;
> +}
> 
>>
>>
>>> Please also fix the subject in the patches. They are all missing the bpf-next and revision tag.
>>>
>>
>> Took me a few moments to realize that as I was looking at earlier series. Looks like I forgot to add the tags to subsequent patches in this series. I'll fix it up in the next push.
>

Aditi Ghag May 4, 2023, 5:32 p.m. UTC | #7
> On May 2, 2023, at 4:40 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
> 
> On 5/2/23 3:52 PM, Aditi Ghag wrote:
>>> On May 1, 2023, at 4:32 PM, Aditi Ghag <aditi.ghag@isovalent.com> wrote:
>>> 
>>> 
>>> 
>>>> On Apr 24, 2023, at 3:15 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>> 
>>>> On 4/18/23 8:31 AM, Aditi Ghag wrote:
>>>>> This patch adds the capability to destroy sockets in BPF. We plan to use
>>>>> the capability in Cilium to force client sockets to reconnect when their
>>>>> remote load-balancing backends are deleted. The other use case is
>>>>> on-the-fly policy enforcement where existing socket connections prevented
>>>>> by policies need to be terminated.
>>>> 
>>>> If the earlier kfunc filter patch (https://lore.kernel.org/bpf/1ECC8AAA-C2E6-4F8A-B7D3-5E90BDEE7C48@isovalent.com/) looks fine to you, please include it into the next revision. This patchset needs it. Usual thing to do is to keep my sob (and author if not much has changed) and add your sob. The test needs to be broken out into a separate patch though. It needs to use the '__failure __msg("calling kernel function bpf_sock_destroy is not allowed")'. There are many examples in selftests, eg. the dynptr_fail.c.
>>>> 
>>> 
>>> Yeah, ok. I was waiting for your confirmation. The patch doesn't need my sob though (maybe tested-by).
>>> I've created a separate patch for the test.
>> Here is the patch diff for the extended test case for your reference. I'm ready to push a new version once I get an ack from you.
> Looks reasonable to me.
> 
> One thing I have been thinking is the bpf_sock_destroy kfunc should need a KF_TRUSTED_ARGS but I suspect that may need a change in the tcp_reg_info in tcp_ipv4.c. Not sure yet. Regardless, I don't think this will have a major effect on other patches in this set. Please go ahead to respin considering there are a few comments that need to be addressed already. At worst it can use one final revision to address KF_TRUSTED_ARGS.

Pushed the next revision.
Re KF_TRUSTED_ARGS: Looking at the description in the README, it's not entirely clear to me why we need it here, now that we are restricting the kfunc to iterator programs only. Is it somehow related to being able to ensure that the socket argument is locked?

> [ btw, I don't see your reply/confirmation on the Patch 1 discussion also. Please ensure those will also be clarified/addressed in the next respin. ]
> 
> 
>> diff --git a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
>> index a889c53e93c7..afed8cad94ee 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
>> @@ -3,6 +3,7 @@
>>  #include <bpf/bpf_endian.h>
>>  #include "sock_destroy_prog.skel.h"
>> +#include "sock_destroy_prog_fail.skel.h"
>>  #include "network_helpers.h"
>>  #define TEST_NS "sock_destroy_netns"
>> @@ -207,6 +208,8 @@ void test_sock_destroy(void)
>>                 test_udp_server(skel);
>> +       RUN_TESTS(sock_destroy_prog_fail);
>> +
>>  cleanup:
>>         if (nstoken)
>>                 close_netns(nstoken);
>> diff --git a/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
>> new file mode 100644
>> index 000000000000..dd6850b58e25
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
>> @@ -0,0 +1,22 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +#include "vmlinux.h"
>> +#include <bpf/bpf_tracing.h>
>> +#include <bpf/bpf_helpers.h>
>> +
>> +#include "bpf_misc.h"
>> +
>> +char _license[] SEC("license") = "GPL";
>> +
>> +int bpf_sock_destroy(struct sock_common *sk) __ksym;
>> +
>> +SEC("tp_btf/tcp_destroy_sock")
>> +__failure __msg("calling kernel function bpf_sock_destroy is not allowed")
>> +int BPF_PROG(trace_tcp_destroy_sock, struct sock *sk)
>> +{
>> +       /* should not load */
>> +       bpf_sock_destroy((struct sock_common *)sk);
>> +
>> +       return 0;
>> +}
>>> 
>>> 
>>>> Please also fix the subject in the patches. They are all missing the bpf-next and revision tag.
>>>> 
>>> 
>>> Took me a few moments to realize that as I was looking at earlier series. Looks like I forgot to add the tags to subsequent patches in this series. I'll fix it up in the next push.