diff mbox series

[2/4] bpf: Remove size check for sk in bpf_skb_is_valid_access for 32-bit architecture

Message ID 20221103083254.237646-3-yangjihong1@huawei.com (mailing list archive)
State New
Headers show
Series bpf: Support kernel function call in 32-bit ARM | expand

Commit Message

Yang Jihong Nov. 3, 2022, 8:32 a.m. UTC
The error code -EACCES is returned when bpf prog is tested in 32-bit environment,
This is because bpf_object__relocate modifies the instruction to change memory
size to 4 bytes, as shown in the following messages:

libbpf: prog 'kfunc_call_test1': relo #2: matching candidate #0 <byte_off> [18342] struct __sk_buff.sk (0:30:0 @ offset 168)
libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) off 168 -> 168
libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) mem_sz 8 -> 4

As a result, the bpf_skb_is_valid_access check fails. For 32-bit architecture,
unnecessary checks need to be deleted.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 net/core/filter.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Andrii Nakryiko Nov. 4, 2022, 9:56 p.m. UTC | #1
On Thu, Nov 3, 2022 at 1:36 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>
> The error code -EACCES is returned when bpf prog is tested in 32-bit environment,
> This is because bpf_object__relocate modifies the instruction to change memory
> size to 4 bytes, as shown in the following messages:
>
> libbpf: prog 'kfunc_call_test1': relo #2: matching candidate #0 <byte_off> [18342] struct __sk_buff.sk (0:30:0 @ offset 168)
> libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) off 168 -> 168
> libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) mem_sz 8 -> 4
>
> As a result, the bpf_skb_is_valid_access check fails. For 32-bit architecture,
> unnecessary checks need to be deleted.
>
> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> ---
>  net/core/filter.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index bb0136e7a8e4..eab7ce89740c 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -8269,8 +8269,6 @@ static bool bpf_skb_is_valid_access(int off, int size, enum bpf_access_type type
>                         return false;
>                 break;
>         case offsetof(struct __sk_buff, sk):
> -               if (type == BPF_WRITE || size != sizeof(__u64))
> -                       return false;

this probably should be specific to host architecture bitness? I'd
imagine that size = 4 should be invalid on 64-bit arches (reading half
of the pointer is bad)

either way, please make sure to add tests specifically for this case
in test_verifier



>                 info->reg_type = PTR_TO_SOCK_COMMON_OR_NULL;
>                 break;
>         case offsetof(struct __sk_buff, tstamp_type):
> --
> 2.30.GIT
>
Alexei Starovoitov Nov. 4, 2022, 11:32 p.m. UTC | #2
On Fri, Nov 4, 2022 at 2:56 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Nov 3, 2022 at 1:36 AM Yang Jihong <yangjihong1@huawei.com> wrote:
> >
> > The error code -EACCES is returned when bpf prog is tested in 32-bit environment,
> > This is because bpf_object__relocate modifies the instruction to change memory
> > size to 4 bytes, as shown in the following messages:
> >
> > libbpf: prog 'kfunc_call_test1': relo #2: matching candidate #0 <byte_off> [18342] struct __sk_buff.sk (0:30:0 @ offset 168)
> > libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) off 168 -> 168
> > libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) mem_sz 8 -> 4
> >
> > As a result, the bpf_skb_is_valid_access check fails. For 32-bit architecture,
> > unnecessary checks need to be deleted.
> >
> > Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> > ---
> >  net/core/filter.c | 2 --
> >  1 file changed, 2 deletions(-)
> >
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index bb0136e7a8e4..eab7ce89740c 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -8269,8 +8269,6 @@ static bool bpf_skb_is_valid_access(int off, int size, enum bpf_access_type type
> >                         return false;
> >                 break;
> >         case offsetof(struct __sk_buff, sk):
> > -               if (type == BPF_WRITE || size != sizeof(__u64))
> > -                       return false;
>
> this probably should be specific to host architecture bitness? I'd
> imagine that size = 4 should be invalid on 64-bit arches (reading half
> of the pointer is bad)

Not quite.
In __sk_buff the field 'sk' is defined as:
__bpf_md_ptr(struct bpf_sock *, sk);
so it's always 64-bit load when bpf prog reads it.
In this case CO_RE shouldn't have been applied to uapi struct __sk_buff.
Andrii Nakryiko Nov. 8, 2022, 1:07 a.m. UTC | #3
On Fri, Nov 4, 2022 at 4:32 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Nov 4, 2022 at 2:56 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Nov 3, 2022 at 1:36 AM Yang Jihong <yangjihong1@huawei.com> wrote:
> > >
> > > The error code -EACCES is returned when bpf prog is tested in 32-bit environment,
> > > This is because bpf_object__relocate modifies the instruction to change memory
> > > size to 4 bytes, as shown in the following messages:
> > >
> > > libbpf: prog 'kfunc_call_test1': relo #2: matching candidate #0 <byte_off> [18342] struct __sk_buff.sk (0:30:0 @ offset 168)
> > > libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) off 168 -> 168
> > > libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) mem_sz 8 -> 4
> > >
> > > As a result, the bpf_skb_is_valid_access check fails. For 32-bit architecture,
> > > unnecessary checks need to be deleted.
> > >
> > > Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> > > ---
> > >  net/core/filter.c | 2 --
> > >  1 file changed, 2 deletions(-)
> > >
> > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > index bb0136e7a8e4..eab7ce89740c 100644
> > > --- a/net/core/filter.c
> > > +++ b/net/core/filter.c
> > > @@ -8269,8 +8269,6 @@ static bool bpf_skb_is_valid_access(int off, int size, enum bpf_access_type type
> > >                         return false;
> > >                 break;
> > >         case offsetof(struct __sk_buff, sk):
> > > -               if (type == BPF_WRITE || size != sizeof(__u64))
> > > -                       return false;
> >
> > this probably should be specific to host architecture bitness? I'd
> > imagine that size = 4 should be invalid on 64-bit arches (reading half
> > of the pointer is bad)
>
> Not quite.
> In __sk_buff the field 'sk' is defined as:
> __bpf_md_ptr(struct bpf_sock *, sk);
> so it's always 64-bit load when bpf prog reads it.
> In this case CO_RE shouldn't have been applied to uapi struct __sk_buff.

Ok, hold on. __bpf_md_ptr just creates a 8-byte sized and aligned
union. It doesn't change the pointer itself in any way:

union {
    struct bpf_sock* sk;
    __u64 :64;
};


It's a 64-bit pointer only because any pointer in the BPF target is
64-bit. But on 32-bit architectures such struct bpf_sock *sk pointer
will *actually* be 4-byte pointer (and __u64 :64 will just make
compiler add 4 bytes of padding after it, effectively), and BPF
verifier will actually generate LDX instruction of BPF_W size (4 byte
load):

        case offsetof(struct __sk_buff, sk):
                *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, sk),
                                      si->dst_reg, si->src_reg,
                                      offsetof(struct sk_buff, sk));
                break;


BPF_FIELD_SIZEOF(struct sk_buff, sk) is 4 for 32-bit kernels.

So while you are correct that it will be 8-byte load from the BPF
side, allowing 4-byte load for such pointers should also be correct.
It's our choice, there is no fundamental limitation why this shouldn't
be the case.

Note also that we do this transformation when fentry/fexit/raw_tp_btf
programs traverse pointers in kernel structures. There pretending like
pointer to an 8-byte value is actually invalid. So libbpf adjusts such
loads to 4-byte loads for CO-RE-relocatable types, which makes it all
work transparently on 32-bit architectures. Context accesses deviate
from that, as they came earlier and we didn't have CO-RE at that time.

So what you are saying is that __sk_buff shouldn't be
CO-RE-relocatable, and yes, that would be good. But I think that's
orthogonal in this case.
Martin KaFai Lau Nov. 8, 2022, 2:28 a.m. UTC | #4
On 11/7/22 5:07 PM, Andrii Nakryiko wrote:
> On Fri, Nov 4, 2022 at 4:32 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Fri, Nov 4, 2022 at 2:56 PM Andrii Nakryiko
>> <andrii.nakryiko@gmail.com> wrote:
>>>
>>> On Thu, Nov 3, 2022 at 1:36 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>>>>
>>>> The error code -EACCES is returned when bpf prog is tested in 32-bit environment,
>>>> This is because bpf_object__relocate modifies the instruction to change memory
>>>> size to 4 bytes, as shown in the following messages:
>>>>
>>>> libbpf: prog 'kfunc_call_test1': relo #2: matching candidate #0 <byte_off> [18342] struct __sk_buff.sk (0:30:0 @ offset 168)
>>>> libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) off 168 -> 168
>>>> libbpf: prog 'kfunc_call_test1': relo #2: patched insn #1 (LDX/ST/STX) mem_sz 8 -> 4
>>>>
>>>> As a result, the bpf_skb_is_valid_access check fails. For 32-bit architecture,
>>>> unnecessary checks need to be deleted.
>>>>
>>>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
>>>> ---
>>>>   net/core/filter.c | 2 --
>>>>   1 file changed, 2 deletions(-)
>>>>
>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>>> index bb0136e7a8e4..eab7ce89740c 100644
>>>> --- a/net/core/filter.c
>>>> +++ b/net/core/filter.c
>>>> @@ -8269,8 +8269,6 @@ static bool bpf_skb_is_valid_access(int off, int size, enum bpf_access_type type
>>>>                          return false;
>>>>                  break;
>>>>          case offsetof(struct __sk_buff, sk):
>>>> -               if (type == BPF_WRITE || size != sizeof(__u64))
>>>> -                       return false;
>>>
>>> this probably should be specific to host architecture bitness? I'd
>>> imagine that size = 4 should be invalid on 64-bit arches (reading half
>>> of the pointer is bad)
>>
>> Not quite.
>> In __sk_buff the field 'sk' is defined as:
>> __bpf_md_ptr(struct bpf_sock *, sk);
>> so it's always 64-bit load when bpf prog reads it.
>> In this case CO_RE shouldn't have been applied to uapi struct __sk_buff.
> 
> Ok, hold on. __bpf_md_ptr just creates a 8-byte sized and aligned
> union. It doesn't change the pointer itself in any way:
> 
> union {
>      struct bpf_sock* sk;
>      __u64 :64;
> };
> 
> 
> It's a 64-bit pointer only because any pointer in the BPF target is
> 64-bit. But on 32-bit architectures such struct bpf_sock *sk pointer
> will *actually* be 4-byte pointer (and __u64 :64 will just make
> compiler add 4 bytes of padding after it, effectively), and BPF
> verifier will actually generate LDX instruction of BPF_W size (4 byte
> load):
> 
>          case offsetof(struct __sk_buff, sk):
>                  *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, sk),
>                                        si->dst_reg, si->src_reg,
>                                        offsetof(struct sk_buff, sk));
>                  break;
> 
> 
> BPF_FIELD_SIZEOF(struct sk_buff, sk) is 4 for 32-bit kernels.
> 
> So while you are correct that it will be 8-byte load from the BPF
> side, allowing 4-byte load for such pointers should also be correct.
> It's our choice, there is no fundamental limitation why this shouldn't
> be the case.
> 
> Note also that we do this transformation when fentry/fexit/raw_tp_btf
> programs traverse pointers in kernel structures. There pretending like
> pointer to an 8-byte value is actually invalid. So libbpf adjusts such
> loads to 4-byte loads for CO-RE-relocatable types, which makes it all
> work transparently on 32-bit architectures. Context accesses deviate
> from that, as they came earlier and we didn't have CO-RE at that time.
> 
> So what you are saying is that __sk_buff shouldn't be
> CO-RE-relocatable, and yes, that would be good. But I think that's
> orthogonal in this case.

This issue should be from
commit c1ff181ffabc ("selftests/bpf: Extend kfunc selftests") which replaced the 
uapi's bpf.h with vmlinux.h.  One option to unblock this for now is to separate 
those tests that read __sk_buff->sk to its own prog.c and use the uapi's bpf.h 
instead of vmlinux.h.

It would be nice if the bpf-tc program can take 'struct sk_buff *skb' instead of 
'struct __sk_buff *skb' but it will be a separate topic.
diff mbox series

Patch

diff --git a/net/core/filter.c b/net/core/filter.c
index bb0136e7a8e4..eab7ce89740c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -8269,8 +8269,6 @@  static bool bpf_skb_is_valid_access(int off, int size, enum bpf_access_type type
 			return false;
 		break;
 	case offsetof(struct __sk_buff, sk):
-		if (type == BPF_WRITE || size != sizeof(__u64))
-			return false;
 		info->reg_type = PTR_TO_SOCK_COMMON_OR_NULL;
 		break;
 	case offsetof(struct __sk_buff, tstamp_type):