
[v2,bpf-next,3/5] bpf: Allow kfuncs to be used in LSM programs

Message ID 20220621012811.2683313-4-kpsingh@kernel.org (mailing list archive)
State Superseded
Delegated to: BPF
Series Add bpf_getxattr

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success
netdev/cover_letter success Series has a cover letter
netdev/patch_count success
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 6 this patch: 6
netdev/cc_maintainers warning 5 maintainers not CCed: netdev@vger.kernel.org songliubraving@fb.com yhs@fb.com john.fastabend@gmail.com kafai@fb.com
netdev/build_clang success Errors and warnings before: 6 this patch: 6
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 6 this patch: 6
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 7 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-1 success Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Kernel LATEST on ubuntu-latest with llvm-15
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for Kernel LATEST on z15 with gcc

Commit Message

KP Singh June 21, 2022, 1:28 a.m. UTC
In preparation for the addition of bpf_getxattr kfunc.

Signed-off-by: KP Singh <kpsingh@kernel.org>
---
 kernel/bpf/btf.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Kumar Kartikeya Dwivedi June 21, 2022, 12:36 p.m. UTC | #1
On Tue, Jun 21, 2022 at 06:58:09AM IST, KP Singh wrote:
> In preparation for the addition of bpf_getxattr kfunc.
>
> Signed-off-by: KP Singh <kpsingh@kernel.org>
> ---
>  kernel/bpf/btf.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 02d7951591ae..541cf4635aa1 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -7264,6 +7264,7 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
>  	case BPF_PROG_TYPE_STRUCT_OPS:
>  		return BTF_KFUNC_HOOK_STRUCT_OPS;
>  	case BPF_PROG_TYPE_TRACING:
> +	case BPF_PROG_TYPE_LSM:
>  		return BTF_KFUNC_HOOK_TRACING;

Should we define another BTF_KFUNC_HOOK_LSM instead? Otherwise when you register
for tracing or lsm progs, you write to the same hook instead, so kfunc enabled
for tracing progs also gets enabled for lsm, I guess that is not what user
intends when registering kfunc set.

>  	case BPF_PROG_TYPE_SYSCALL:
>  		return BTF_KFUNC_HOOK_SYSCALL;
> --
> 2.37.0.rc0.104.g0611611a94-goog
>

--
Kartikeya
Alexei Starovoitov June 21, 2022, 4 p.m. UTC | #2
On Tue, Jun 21, 2022 at 5:36 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Tue, Jun 21, 2022 at 06:58:09AM IST, KP Singh wrote:
> > In preparation for the addition of bpf_getxattr kfunc.
> >
> > Signed-off-by: KP Singh <kpsingh@kernel.org>
> > ---
> >  kernel/bpf/btf.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 02d7951591ae..541cf4635aa1 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -7264,6 +7264,7 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
> >       case BPF_PROG_TYPE_STRUCT_OPS:
> >               return BTF_KFUNC_HOOK_STRUCT_OPS;
> >       case BPF_PROG_TYPE_TRACING:
> > +     case BPF_PROG_TYPE_LSM:
> >               return BTF_KFUNC_HOOK_TRACING;
>
> Should we define another BTF_KFUNC_HOOK_LSM instead? Otherwise when you register
> for tracing or lsm progs, you write to the same hook instead, so kfunc enabled
> for tracing progs also gets enabled for lsm, I guess that is not what user
> intends when registering kfunc set.

It's probably ok for this case.
We might combine BTF_KFUNC_HOOK* into fewer hooks and
scope them by attach_btf_id.
Everything is 'tracing' like at the end.
Upcoming hid-bpf is 'tracing'. lsm is 'tracing'.
but we may need to reduce the scope of kfuncs
based on attach point or based on argument type.
tc vs xdp don't need to be separate sets.
Their types (skb vs xdp_md) are different, so only
right kfuncs can be used, but bpf_ct_release() is needed
in both tc and xdp.
So we could combine tc and xdp into 'btf_kfunc_hook_networking'
without compromising the safety.
acquire vs release ideally would be indicated with btf_tag,
but gcc still doesn't have support for btf_tag and we cannot
require the kernel to be built with clang.
acquire, release, sleepable, ret_null should be flags on a kfunc
instead of a set. It would be easier to manage and less boilerplate
code. Not clear how to do this cleanly.
export_symbol approach is a bit heavy and requires name being unique
across kernel and modules.
func name mangling or typedef-ing comes to mind. not so clean either.
KP Singh June 21, 2022, 4:15 p.m. UTC | #3
On Tue, Jun 21, 2022 at 6:01 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jun 21, 2022 at 5:36 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Tue, Jun 21, 2022 at 06:58:09AM IST, KP Singh wrote:
> > > In preparation for the addition of bpf_getxattr kfunc.
> > >
> > > Signed-off-by: KP Singh <kpsingh@kernel.org>

[...]

> > > +++ b/kernel/bpf/btf.c
> > > @@ -7264,6 +7264,7 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
> > >       case BPF_PROG_TYPE_STRUCT_OPS:
> > >               return BTF_KFUNC_HOOK_STRUCT_OPS;
> > >       case BPF_PROG_TYPE_TRACING:
> > > +     case BPF_PROG_TYPE_LSM:
> > >               return BTF_KFUNC_HOOK_TRACING;
> >
> > Should we define another BTF_KFUNC_HOOK_LSM instead? Otherwise when you register
> > for tracing or lsm progs, you write to the same hook instead, so kfunc enabled
> > for tracing progs also gets enabled for lsm, I guess that is not what user
> > intends when registering kfunc set.
>
> It's probably ok for this case.
> We might combine BTF_KFUNC_HOOK* into fewer hooks and

I did this intentionally. We do this similarly for helpers too and,
unfortunately, we do realize that LSM hooks are not there in all
places (esp. where one is auditing stuff).

So in reality one does use a combination of LSM and tracing hooks
with the policy decision coming from the LSM hook but the state
gets accumulated from different hooks.


> scope them by attach_btf_id.
> Everything is 'tracing' like at the end.
> Upcoming hid-bpf is 'tracing'. lsm is 'tracing'.
> but we may need to reduce the scope of kfuncs
> based on attach point or based on argument type.
> tc vs xdp don't need to be separate sets.
> Their types (skb vs xdp_md) are different, so only
> right kfuncs can be used, but bpf_ct_release() is needed
> in both tc and xdp.
> So we could combine tc and xdp into 'btf_kfunc_hook_networking'
> without compromising the safety.
> acquire vs release ideally would be indicated with btf_tag,
> but gcc still doesn't have support for btf_tag and we cannot
> require the kernel to be built with clang.
> acquire, release, sleepable, ret_null should be flags on a kfunc
> instead of a set. It would be easier to manage and less boilerplate

+1 This would be awesome, I gave it a shot to use btf_tag but hit this
feature gap in GCC.

- KP

> code. Not clear how to do this cleanly.
> export_symbol approach is a bit heavy and requires name being unique
> across kernel and modules.
> func name mangling or typedef-ing comes to mind. not so clean either.
Kumar Kartikeya Dwivedi June 21, 2022, 5:35 p.m. UTC | #4
On Tue, Jun 21, 2022 at 09:45:15PM IST, KP Singh wrote:
> On Tue, Jun 21, 2022 at 6:01 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Jun 21, 2022 at 5:36 AM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Tue, Jun 21, 2022 at 06:58:09AM IST, KP Singh wrote:
> > > > In preparation for the addition of bpf_getxattr kfunc.
> > > >
> > > > Signed-off-by: KP Singh <kpsingh@kernel.org>
>
> [...]
>
> > > > +++ b/kernel/bpf/btf.c
> > > > @@ -7264,6 +7264,7 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
> > > >       case BPF_PROG_TYPE_STRUCT_OPS:
> > > >               return BTF_KFUNC_HOOK_STRUCT_OPS;
> > > >       case BPF_PROG_TYPE_TRACING:
> > > > +     case BPF_PROG_TYPE_LSM:
> > > >               return BTF_KFUNC_HOOK_TRACING;
> > >
> > > Should we define another BTF_KFUNC_HOOK_LSM instead? Otherwise when you register
> > > for tracing or lsm progs, you write to the same hook instead, so kfunc enabled
> > > for tracing progs also gets enabled for lsm, I guess that is not what user
> > > intends when registering kfunc set.
> >
> > It's probably ok for this case.
> > We might combine BTF_KFUNC_HOOK* into fewer hooks and
>
> I did this intentionally. We do this similarly for helpers too and we
> unfortunately, we do realize that LSM hooks are not there in all
> places (esp. where one is auditing stuff).
>
> So in reality one does use a combination of LSM and tracing hooks
> with the policy decision coming from the LSM hook but the state
> gets accumulated from different hooks.
>
>
> > scope them by attach_btf_id.
> > Everything is 'tracing' like at the end.
> > Upcoming hid-bpf is 'tracing'. lsm is 'tracing'.
> > but we may need to reduce the scope of kfuncs
> > based on attach point or based on argument type.
> > tc vs xdp don't need to be separate sets.
> > Their types (skb vs xdp_md) are different, so only
> > right kfuncs can be used, but bpf_ct_release() is needed
> > in both tc and xdp.
> > So we could combine tc and xdp into 'btf_kfunc_hook_networking'
> > without compromising the safety.
> > acquire vs release ideally would be indicated with btf_tag,
> > but gcc still doesn't have support for btf_tag and we cannot
> > require the kernel to be built with clang.
> > acquire, release, sleepable, ret_null should be flags on a kfunc
> > instead of a set. It would be easier to manage and less boilerplate
>
> +1 This would be awesome, I gave it a shot to use btf_tag but hit this
> feature gap in GCC.
>
> - KP
>
> > code. Not clear how to do this cleanly.
> > export_symbol approach is a bit heavy and requires name being unique
> > across kernel and modules.
> > func name mangling or typedef-ing comes to mind. not so clean either.

How does this approach look to 'tag' a kfunc?

https://godbolt.org/z/jGeK6Y49K

Then we find the function whose symbol should be emitted and available in BTF
with the prefix (__kfunc_name_) and read off the tag values. Due to the extern
inline it should still inline the definition, so there is no overhead at
runtime.

As you see in godbolt, __foobar_1_2 symbol is visible, with 1 and 2 being tag
values. Substituting tag name with underlying value makes parsing easier. We
can have fixed number of slots and keep others as 0 (ignore).

It can be extended to also tag arguments, which I already did before at [0], but
never sent out. Then we don't need the suffixes or special naming of arguments
to indicate some tag.

e.g.

bpf_ct_release will be:

KFUNC_CALL_0(void, bpf_ct_release, __kfunc_release,
	     KARG(struct nf_conn *, nfct, __ptr_ref))
{
	...
}

__kfunc_acquire and __kfunc_ret_null will go in the same place as __kfunc_release.

We may also allow tagging as ptr_to_btf_id or ptr_to_mem etc. to force passing a
certain register type and avoid ambiguity.

 [0]: https://gist.github.com/kkdwivedi/7839cc9e4f002acc3e15350b1b86c88c

--
Kartikeya
Alexei Starovoitov June 21, 2022, 8:11 p.m. UTC | #5
On Tue, Jun 21, 2022 at 11:05:45PM +0530, Kumar Kartikeya Dwivedi wrote:
> On Tue, Jun 21, 2022 at 09:45:15PM IST, KP Singh wrote:
> > On Tue, Jun 21, 2022 at 6:01 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Tue, Jun 21, 2022 at 5:36 AM Kumar Kartikeya Dwivedi
> > > <memxor@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 21, 2022 at 06:58:09AM IST, KP Singh wrote:
> > > > > In preparation for the addition of bpf_getxattr kfunc.
> > > > >
> > > > > Signed-off-by: KP Singh <kpsingh@kernel.org>
> >
> > [...]
> >
> > > > > +++ b/kernel/bpf/btf.c
> > > > > @@ -7264,6 +7264,7 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
> > > > >       case BPF_PROG_TYPE_STRUCT_OPS:
> > > > >               return BTF_KFUNC_HOOK_STRUCT_OPS;
> > > > >       case BPF_PROG_TYPE_TRACING:
> > > > > +     case BPF_PROG_TYPE_LSM:
> > > > >               return BTF_KFUNC_HOOK_TRACING;
> > > >
> > > > Should we define another BTF_KFUNC_HOOK_LSM instead? Otherwise when you register
> > > > for tracing or lsm progs, you write to the same hook instead, so kfunc enabled
> > > > for tracing progs also gets enabled for lsm, I guess that is not what user
> > > > intends when registering kfunc set.
> > >
> > > It's probably ok for this case.
> > > We might combine BTF_KFUNC_HOOK* into fewer hooks and
> >
> > I did this intentionally. We do this similarly for helpers too and we
> > unfortunately, we do realize that LSM hooks are not there in all
> > places (esp. where one is auditing stuff).
> >
> > So in reality one does use a combination of LSM and tracing hooks
> > with the policy decision coming from the LSM hook but the state
> > gets accumulated from different hooks.
> >
> >
> > > scope them by attach_btf_id.
> > > Everything is 'tracing' like at the end.
> > > Upcoming hid-bpf is 'tracing'. lsm is 'tracing'.
> > > but we may need to reduce the scope of kfuncs
> > > based on attach point or based on argument type.
> > > tc vs xdp don't need to be separate sets.
> > > Their types (skb vs xdp_md) are different, so only
> > > right kfuncs can be used, but bpf_ct_release() is needed
> > > in both tc and xdp.
> > > So we could combine tc and xdp into 'btf_kfunc_hook_networking'
> > > without compromising the safety.
> > > acquire vs release ideally would be indicated with btf_tag,
> > > but gcc still doesn't have support for btf_tag and we cannot
> > > require the kernel to be built with clang.
> > > acquire, release, sleepable, ret_null should be flags on a kfunc
> > > instead of a set. It would be easier to manage and less boilerplate
> >
> > +1 This would be awesome, I gave it a shot to use btf_tag but hit this
> > feature gap in GCC.
> >
> > - KP
> >
> > > code. Not clear how to do this cleanly.
> > > export_symbol approach is a bit heavy and requires name being unique
> > > across kernel and modules.
> > > func name mangling or typedef-ing comes to mind. not so clean either.
> 
> How does this approach look to 'tag' a kfunc?
> 
> https://godbolt.org/z/jGeK6Y49K
> 
> Then we find the function whose symbol should be emitted and available in BTF
> with the prefix (__kfunc_name_) and read off the tag values. Due to the extern
> inline it should still inline the definition, so there is no overhead at
> runtime.

No run-time overhead, but code size duplication is prohibitive.
Both functions will have full body.
We can hack around with another noinline function that both will call,
but it's getting ugly.

> As you see in godbolt, __foobar_1_2 symbol is visible, with 1 and 2 being tag
> values. Substituting tag name with underlying value makes parsing easier. We
> can have fixed number of slots and keep others as 0 (ignore).
> 
> It can be extended to also tag arguments, which I already did before at [0], but
> never sent out. Then we don't need the suffixes or special naming of arguments
> to indicate some tag.
> 
> e.g.
> 
> bpf_ct_release will be:
> 
> KFUNC_CALL_0(void, bpf_ct_release, __kfunc_release,
> 	     KARG(struct nf_conn *, nfct, __ptr_ref))
> {
> 	...
> }

-struct nf_conn *
-bpf_skb_ct_lookup(struct __sk_buff *skb_ctx, struct bpf_sock_tuple *bpf_tuple,
-		  u32 tuple__sz, struct bpf_ct_opts *opts, u32 opts__sz)
+KFUNC_CALL_5(struct nf_conn *, bpf_skb_ct_lookup,
+	     KARG(struct __sk_buff *, skb_ctx, PTR_TO_CTX),
+	     KARG(struct bpf_sock_tuple *, bpf_tuple, PTR_TO_MEM),
+	     KARG(u32, tuple__sz, SIZE),
+	     KARG(struct bpf_ct_opts *, opts, PTR_TO_MEM),
+	     KARG(u32, opts__sz, SIZE))

I think it's too ugly for majority of kernel folks to accept.
We will have a lot more kfuncs than we have now. Above code will be
present in many different subsystems. The trade-off is not worth it.
__sz suffix convention works. Similar approach can be used.
This is orthogonal discussion anyway.

Back to kfunc tagging.
How about we use
  BTF_SET_START8(list)
  BTF_ID_FLAGS(func, bpf_xdp_ct_lookup, ACQUIRE | RET_NULL)
  BTF_ID_FLAGS(func, bpf_ct_release, RELEASE)
  BTF_SET_END8(list)

where BTF_SET_START8 macro will do "__BTF_ID__set8__" #name
so that resolve_btfid side knows that it needs to ignore 4 bytes after btf_id
when it's doing set sorting.
The kernel side btf_id_set_contains() would need to call bsearch()
with sizeof(u64) instead of sizeof(u32).

The verifier side will be cleaner and faster too.
Instead of two bsearch calls (that are about to become 3 calls with sleepable set):
                rel = btf_kfunc_id_set_contains(btf, resolve_prog_type(env->prog),
                                                BTF_KFUNC_TYPE_RELEASE, func_id);
                kptr_get = btf_kfunc_id_set_contains(btf, resolve_prog_type(env->prog),
                                                     BTF_KFUNC_TYPE_KPTR_ACQUIRE, func_id);
It will be a single call plus flag check.
KP Singh June 21, 2022, 8:25 p.m. UTC | #6
On Tue, Jun 21, 2022 at 10:12 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jun 21, 2022 at 11:05:45PM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Tue, Jun 21, 2022 at 09:45:15PM IST, KP Singh wrote:
> > > On Tue, Jun 21, 2022 at 6:01 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 21, 2022 at 5:36 AM Kumar Kartikeya Dwivedi
> > > > <memxor@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jun 21, 2022 at 06:58:09AM IST, KP Singh wrote:
> > > > > > In preparation for the addition of bpf_getxattr kfunc.
> > > > > >
> > > > > > Signed-off-by: KP Singh <kpsingh@kernel.org>
> > >
> > > [...]
> > >
> > > > > > +++ b/kernel/bpf/btf.c
> > > > > > @@ -7264,6 +7264,7 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
> > > > > >       case BPF_PROG_TYPE_STRUCT_OPS:
> > > > > >               return BTF_KFUNC_HOOK_STRUCT_OPS;
> > > > > >       case BPF_PROG_TYPE_TRACING:
> > > > > > +     case BPF_PROG_TYPE_LSM:
> > > > > >               return BTF_KFUNC_HOOK_TRACING;
> > > > >
> > > > > Should we define another BTF_KFUNC_HOOK_LSM instead? Otherwise when you register
> > > > > for tracing or lsm progs, you write to the same hook instead, so kfunc enabled
> > > > > for tracing progs also gets enabled for lsm, I guess that is not what user
> > > > > intends when registering kfunc set.
> > > >
> > > > It's probably ok for this case.
> > > > We might combine BTF_KFUNC_HOOK* into fewer hooks and
> > >
> > > I did this intentionally. We do this similarly for helpers too and we
> > > unfortunately, we do realize that LSM hooks are not there in all
> > > places (esp. where one is auditing stuff).
> > >
> > > So in reality one does use a combination of LSM and tracing hooks
> > > with the policy decision coming from the LSM hook but the state
> > > gets accumulated from different hooks.
> > >
> > >
> > > > scope them by attach_btf_id.
> > > > Everything is 'tracing' like at the end.
> > > > Upcoming hid-bpf is 'tracing'. lsm is 'tracing'.
> > > > but we may need to reduce the scope of kfuncs
> > > > based on attach point or based on argument type.
> > > > tc vs xdp don't need to be separate sets.
> > > > Their types (skb vs xdp_md) are different, so only
> > > > right kfuncs can be used, but bpf_ct_release() is needed
> > > > in both tc and xdp.
> > > > So we could combine tc and xdp into 'btf_kfunc_hook_networking'
> > > > without compromising the safety.
> > > > acquire vs release ideally would be indicated with btf_tag,
> > > > but gcc still doesn't have support for btf_tag and we cannot
> > > > require the kernel to be built with clang.
> > > > acquire, release, sleepable, ret_null should be flags on a kfunc
> > > > instead of a set. It would be easier to manage and less boilerplate
> > >
> > > +1 This would be awesome, I gave it a shot to use btf_tag but hit this
> > > feature gap in GCC.
> > >
> > > - KP
> > >
> > > > code. Not clear how to do this cleanly.
> > > > export_symbol approach is a bit heavy and requires name being unique
> > > > across kernel and modules.
> > > > func name mangling or typedef-ing comes to mind. not so clean either.
> >
> > How does this approach look to 'tag' a kfunc?
> >
> > https://godbolt.org/z/jGeK6Y49K
> >
> > Then we find the function whose symbol should be emitted and available in BTF
> > with the prefix (__kfunc_name_) and read off the tag values. Due to the extern
> > inline it should still inline the definition, so there is no overhead at
> > runtime.
>
> No run-time overhead, but code size duplication is prohibitive.
> Both functions will have full body.
> We can hack around with another noinline function that both will call,
> but it's getting ugly.
>
> > As you see in godbolt, __foobar_1_2 symbol is visible, with 1 and 2 being tag
> > values. Substituting tag name with underlying value makes parsing easier. We
> > can have fixed number of slots and keep others as 0 (ignore).
> >
> > It can be extended to also tag arguments, which I already did before at [0], but
> > never sent out. Then we don't need the suffixes or special naming of arguments
> > to indicate some tag.
> >
> > e.g.
> >
> > bpf_ct_release will be:
> >
> > KFUNC_CALL_0(void, bpf_ct_release, __kfunc_release,
> >            KARG(struct nf_conn *, nfct, __ptr_ref))
> > {
> >       ...
> > }
>
> -struct nf_conn *
> -bpf_skb_ct_lookup(struct __sk_buff *skb_ctx, struct bpf_sock_tuple *bpf_tuple,
> -                 u32 tuple__sz, struct bpf_ct_opts *opts, u32 opts__sz)
> +KFUNC_CALL_5(struct nf_conn *, bpf_skb_ct_lookup,
> +            KARG(struct __sk_buff *, skb_ctx, PTR_TO_CTX),
> +            KARG(struct bpf_sock_tuple *, bpf_tuple, PTR_TO_MEM),
> +            KARG(u32, tuple__sz, SIZE),
> +            KARG(struct bpf_ct_opts *, opts, PTR_TO_MEM),
> +            KARG(u32, opts__sz, SIZE))
>
> I think it's too ugly for majority of kernel folks to accept.

I agree. It's quite a creative idea :D

but will pollute the symbol table a lot.

> We will have a lot more kfuncs than we have now. Above code will be
> present in many different subsystems. The trade-off is not worth it.
> __sz suffix convention works. Similar approach can be used.
> This is orthogonal discussion anyway.
>
> Back to kfunc tagging.
> How about we use
>   BTF_SET_START8(list)
>   BTF_ID_FLAGS(func, bpf_xdp_ct_lookup, ACQUIRE | RET_NULL)
>   BTF_ID_FLAGS(func, bpf_ct_release, RELEASE)
>   BTF_SET_END8(list)

SGTM. Kumar is this something you can send a patch for?

Also, while we are at it, it would be great to document what these sets
are in:

https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/btf.h#n31

- KP




>
> where BTF_SET_START8 macro will do "__BTF_ID__set8__" #name
> so that resolve_btfid side knows that it needs to ignore 4 bytes after btf_id
> when it's doing set sorting.
> The kernel side btf_id_set_contains() would need to call bsearch()
> with sizeof(u64) instead of sizeof(u32).
>
> The verifier side will be cleaner and faster too.
> Instead of two bsearch calls (that are about to become 3 calls with sleepable set):
>                 rel = btf_kfunc_id_set_contains(btf, resolve_prog_type(env->prog),
>                                                 BTF_KFUNC_TYPE_RELEASE, func_id);
>                 kptr_get = btf_kfunc_id_set_contains(btf, resolve_prog_type(env->prog),
>                                                      BTF_KFUNC_TYPE_KPTR_ACQUIRE, func_id);
> It will be a single call plus flag check.
Kumar Kartikeya Dwivedi June 22, 2022, 1:17 a.m. UTC | #7
On Wed, Jun 22, 2022 at 01:55:29AM IST, KP Singh wrote:
> On Tue, Jun 21, 2022 at 10:12 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Jun 21, 2022 at 11:05:45PM +0530, Kumar Kartikeya Dwivedi wrote:
> > > On Tue, Jun 21, 2022 at 09:45:15PM IST, KP Singh wrote:
> > > > On Tue, Jun 21, 2022 at 6:01 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jun 21, 2022 at 5:36 AM Kumar Kartikeya Dwivedi
> > > > > <memxor@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Jun 21, 2022 at 06:58:09AM IST, KP Singh wrote:
> > > > > > > In preparation for the addition of bpf_getxattr kfunc.
> > > > > > >
> > > > > > > Signed-off-by: KP Singh <kpsingh@kernel.org>
> > > >
> > > > [...]
> > > >
> > > > > > > +++ b/kernel/bpf/btf.c
> > > > > > > @@ -7264,6 +7264,7 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
> > > > > > >       case BPF_PROG_TYPE_STRUCT_OPS:
> > > > > > >               return BTF_KFUNC_HOOK_STRUCT_OPS;
> > > > > > >       case BPF_PROG_TYPE_TRACING:
> > > > > > > +     case BPF_PROG_TYPE_LSM:
> > > > > > >               return BTF_KFUNC_HOOK_TRACING;
> > > > > >
> > > > > > Should we define another BTF_KFUNC_HOOK_LSM instead? Otherwise when you register
> > > > > > for tracing or lsm progs, you write to the same hook instead, so kfunc enabled
> > > > > > for tracing progs also gets enabled for lsm, I guess that is not what user
> > > > > > intends when registering kfunc set.
> > > > >
> > > > > It's probably ok for this case.
> > > > > We might combine BTF_KFUNC_HOOK* into fewer hooks and
> > > >
> > > > I did this intentionally. We do this similarly for helpers too and we
> > > > unfortunately, we do realize that LSM hooks are not there in all
> > > > places (esp. where one is auditing stuff).
> > > >
> > > > So in reality one does use a combination of LSM and tracing hooks
> > > > with the policy decision coming from the LSM hook but the state
> > > > gets accumulated from different hooks.
> > > >
> > > >
> > > > > scope them by attach_btf_id.
> > > > > Everything is 'tracing' like at the end.
> > > > > Upcoming hid-bpf is 'tracing'. lsm is 'tracing'.
> > > > > but we may need to reduce the scope of kfuncs
> > > > > based on attach point or based on argument type.
> > > > > tc vs xdp don't need to be separate sets.
> > > > > Their types (skb vs xdp_md) are different, so only
> > > > > right kfuncs can be used, but bpf_ct_release() is needed
> > > > > in both tc and xdp.
> > > > > So we could combine tc and xdp into 'btf_kfunc_hook_networking'
> > > > > without compromising the safety.
> > > > > acquire vs release ideally would be indicated with btf_tag,
> > > > > but gcc still doesn't have support for btf_tag and we cannot
> > > > > require the kernel to be built with clang.
> > > > > acquire, release, sleepable, ret_null should be flags on a kfunc
> > > > > instead of a set. It would be easier to manage and less boilerplate
> > > >
> > > > +1 This would be awesome, I gave it a shot to use btf_tag but hit this
> > > > feature gap in GCC.
> > > >
> > > > - KP
> > > >
> > > > > code. Not clear how to do this cleanly.
> > > > > export_symbol approach is a bit heavy and requires name being unique
> > > > > across kernel and modules.
> > > > > func name mangling or typedef-ing comes to mind. not so clean either.
> > >
> > > How does this approach look to 'tag' a kfunc?
> > >
> > > https://godbolt.org/z/jGeK6Y49K
> > >
> > > Then we find the function whose symbol should be emitted and available in BTF
> > > with the prefix (__kfunc_name_) and read off the tag values. Due to the extern
> > > inline it should still inline the definition, so there is no overhead at
> > > runtime.
> >
> > No run-time overhead, but code size duplication is prohibitive.
> > Both functions will have full body.
> > We can hack around with another noinline function that both will call,
> > but it's getting ugly.
> >

You can just make the __foobar_1_2 noinline instead: https://godbolt.org/z/zT18h81T6

but anyway...

> > > As you see in godbolt, __foobar_1_2 symbol is visible, with 1 and 2 being tag
> > > values. Substituting tag name with underlying value makes parsing easier. We
> > > can have fixed number of slots and keep others as 0 (ignore).
> > >
> > > It can be extended to also tag arguments, which I already did before at [0], but
> > > never sent out. Then we don't need the suffixes or special naming of arguments
> > > to indicate some tag.
> > >
> > > e.g.
> > >
> > > bpf_ct_release will be:
> > >
> > > KFUNC_CALL_0(void, bpf_ct_release, __kfunc_release,
> > >            KARG(struct nf_conn *, nfct, __ptr_ref))
> > > {
> > >       ...
> > > }
> >
> > -struct nf_conn *
> > -bpf_skb_ct_lookup(struct __sk_buff *skb_ctx, struct bpf_sock_tuple *bpf_tuple,
> > -                 u32 tuple__sz, struct bpf_ct_opts *opts, u32 opts__sz)
> > +KFUNC_CALL_5(struct nf_conn *, bpf_skb_ct_lookup,
> > +            KARG(struct __sk_buff *, skb_ctx, PTR_TO_CTX),
> > +            KARG(struct bpf_sock_tuple *, bpf_tuple, PTR_TO_MEM),
> > +            KARG(u32, tuple__sz, SIZE),
> > +            KARG(struct bpf_ct_opts *, opts, PTR_TO_MEM),
> > +            KARG(u32, opts__sz, SIZE))
> >
> > I think it's too ugly for majority of kernel folks to accept.
>
> I agree. It's quite a creative idea :D
>
> but will pollute the symbol table a lot.
>
> > We will have a lot more kfuncs than we have now. Above code will be
> > present in many different subsystems. The trade-off is not worth it.
> > __sz suffix convention works. Similar approach can be used.
> > This is orthogonal discussion anyway.
> >
> > Back to kfunc tagging.
> > How about we use
> >   BTF_SET_START8(list)
> >   BTF_ID_FLAGS(func, bpf_xdp_ct_lookup, ACQUIRE | RET_NULL)
> >   BTF_ID_FLAGS(func, bpf_ct_release, RELEASE)
> >   BTF_SET_END8(list)
>
> SGTM. Kumar is this something you can send a patch for?
>
> Also, while we are at it, it would be great documenting what these sets
> are in:
>

Yes, the resolve_btfids method looks ok to me. I will send a patch for both of
these things.

> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/btf.h#n31
>
> - KP
>
>
>
>
> >
> > where BTF_SET_START8 macro will do "__BTF_ID__set8__" #name
> > so that resolve_btfid side knows that it needs to ignore 4 bytes after btf_id
> > when it's doing set sorting.
> > The kernel side btf_id_set_contains() would need to call bsearch()
> > with sizeof(u64) instead of sizeof(u32).
> >
> > The verifier side will be cleaner and faster too.
> > Instead of two bsearch calls (that are about to become 3 calls with sleepable set):
> >                 rel = btf_kfunc_id_set_contains(btf, resolve_prog_type(env->prog),
> >                                                 BTF_KFUNC_TYPE_RELEASE, func_id);
> >                 kptr_get = btf_kfunc_id_set_contains(btf, resolve_prog_type(env->prog),
> >                                                      BTF_KFUNC_TYPE_KPTR_ACQUIRE, func_id);
> > It will be a single call plus flag check.

--
Kartikeya

Patch

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 02d7951591ae..541cf4635aa1 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7264,6 +7264,7 @@  static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
 	case BPF_PROG_TYPE_STRUCT_OPS:
 		return BTF_KFUNC_HOOK_STRUCT_OPS;
 	case BPF_PROG_TYPE_TRACING:
+	case BPF_PROG_TYPE_LSM:
 		return BTF_KFUNC_HOOK_TRACING;
 	case BPF_PROG_TYPE_SYSCALL:
 		return BTF_KFUNC_HOOK_SYSCALL;