mbox series

[bpf-next,00/11] xdp: hints via kfuncs

Message ID 20221115030210.3159213-1-sdf@google.com (mailing list archive)
Headers show
Series xdp: hints via kfuncs | expand

Message

Stanislav Fomichev Nov. 15, 2022, 3:01 a.m. UTC
Please see the first patch in the series for the overall
design and use-cases.

Changes since last RFC:

- drop ice/bnxt example implementation (Alexander)

  -ENOHARDWARE to test

- fix/test mlx4 implementation

  Confirmed that I get reasonable looking timestamp.
  The last patch in the series is the small xsk program that can
  be used to dump incoming metadata.

- bpf_push64/bpf_pop64 (Alexei)

  x86_64+arm64(untested)+disassembler

- struct xdp_to_skb_metadata -> struct xdp_skb_metadata (Toke)

  s/xdp_to_skb/xdp_skb/

- Documentation/bpf/xdp-rx-metadata.rst

  Documents functionality, assumptions and limitations.

- bpf_xdp_metadata_export_to_skb returns true/false (Martin)

  Plus xdp_md->skb_metadata field to access it.

- BPF_F_XDP_HAS_METADATA flag (Toke/Martin)

  Drop magic, use the flag instead.

- drop __randomize_layout

  Not sure it's possible to sanely expose it via UAPI. Because every
  .o potentially gets its own randomized layout, test_progs
  refuses to link.

- remove __net_timestamp in veth driver (John/Jesper)

  Instead, calling ktime_get from the kfunc; enough for the selftests.

Future work on RX side:

- Support more devices besides veth and mlx4
- Support more metadata besides RX timestamp.
- Convert skb_metadata_set() callers to xdp_convert_skb_metadata()
  which handles extra xdp_skb_metadata

Prior art (to record pros/cons for different approaches):

- Stable UAPI approach:
  https://lore.kernel.org/bpf/20220628194812.1453059-1-alexandr.lobakin@intel.com/
- Metadata+BTF_ID appoach:
  https://lore.kernel.org/bpf/166256538687.1434226.15760041133601409770.stgit@firesoul/
- kfuncs v2 RFC:
  https://lore.kernel.org/bpf/20221027200019.4106375-1-sdf@google.com/
- kfuncs v1 RFC:
  https://lore.kernel.org/bpf/20221104032532.1615099-1-sdf@google.com/

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org

Stanislav Fomichev (11):
  bpf: Document XDP RX metadata
  bpf: Introduce bpf_patch
  bpf: Support inlined/unrolled kfuncs for xdp metadata
  bpf: Implement hidden BPF_PUSH64 and BPF_POP64 instructions
  veth: Support rx timestamp metadata for xdp
  xdp: Carry over xdp metadata into skb context
  selftests/bpf: Verify xdp_metadata xdp->af_xdp path
  selftests/bpf: Verify xdp_metadata xdp->skb path
  mlx4: Introduce mlx4_xdp_buff wrapper for xdp_buff
  mxl4: Support rx timestamp metadata for xdp
  selftests/bpf: Simple program to dump XDP RX metadata

 Documentation/bpf/kfuncs.rst                  |   8 +
 Documentation/bpf/xdp-rx-metadata.rst         | 109 +++++
 arch/arm64/net/bpf_jit_comp.c                 |   8 +
 arch/x86/net/bpf_jit_comp.c                   |   8 +
 .../net/ethernet/mellanox/mlx4/en_netdev.c    |   2 +
 drivers/net/ethernet/mellanox/mlx4/en_rx.c    |  68 ++-
 drivers/net/veth.c                            |  22 +-
 include/linux/bpf.h                           |   1 +
 include/linux/bpf_patch.h                     |  29 ++
 include/linux/btf.h                           |   1 +
 include/linux/btf_ids.h                       |   4 +
 include/linux/filter.h                        |  23 +
 include/linux/mlx4/device.h                   |   7 +
 include/linux/netdevice.h                     |   5 +
 include/linux/skbuff.h                        |   4 +
 include/net/xdp.h                             |  41 ++
 include/uapi/linux/bpf.h                      |  12 +
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/bpf_patch.c                        |  77 +++
 kernel/bpf/disasm.c                           |   6 +
 kernel/bpf/syscall.c                          |  28 +-
 kernel/bpf/verifier.c                         |  80 ++++
 net/core/dev.c                                |   7 +
 net/core/filter.c                             |  40 ++
 net/core/skbuff.c                             |  20 +
 net/core/xdp.c                                | 184 +++++++-
 tools/include/uapi/linux/bpf.h                |  12 +
 tools/testing/selftests/bpf/.gitignore        |   1 +
 tools/testing/selftests/bpf/DENYLIST.s390x    |   1 +
 tools/testing/selftests/bpf/Makefile          |   8 +-
 .../selftests/bpf/prog_tests/xdp_metadata.c   | 440 ++++++++++++++++++
 .../selftests/bpf/progs/xdp_hw_metadata.c     |  99 ++++
 .../selftests/bpf/progs/xdp_metadata.c        | 114 +++++
 tools/testing/selftests/bpf/xdp_hw_metadata.c | 404 ++++++++++++++++
 tools/testing/selftests/bpf/xdp_hw_metadata.h |   6 +
 35 files changed, 1856 insertions(+), 25 deletions(-)
 create mode 100644 Documentation/bpf/xdp-rx-metadata.rst
 create mode 100644 include/linux/bpf_patch.h
 create mode 100644 kernel/bpf/bpf_patch.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
 create mode 100644 tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
 create mode 100644 tools/testing/selftests/bpf/progs/xdp_metadata.c
 create mode 100644 tools/testing/selftests/bpf/xdp_hw_metadata.c
 create mode 100644 tools/testing/selftests/bpf/xdp_hw_metadata.h

Comments

Toke Høiland-Jørgensen Nov. 15, 2022, 3:54 p.m. UTC | #1
Stanislav Fomichev <sdf@google.com> writes:

> - drop __randomize_layout
>
>   Not sure it's possible to sanely expose it via UAPI. Because every
>   .o potentially gets its own randomized layout, test_progs
>   refuses to link.

So this won't work if the struct is in a kernel-supplied UAPI header
(which would include the __randomize_layout tag). But if it's *not* in a
UAPI header it should still be included in a stable form (i.e., without
the randomize tag) in vmlinux.h, right? Which would be the point:
consumers would be forced to read it from there and do CO-RE on it...

-Toke
Stanislav Fomichev Nov. 15, 2022, 6:37 p.m. UTC | #2
On Tue, Nov 15, 2022 at 7:54 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Stanislav Fomichev <sdf@google.com> writes:
>
> > - drop __randomize_layout
> >
> >   Not sure it's possible to sanely expose it via UAPI. Because every
> >   .o potentially gets its own randomized layout, test_progs
> >   refuses to link.
>
> So this won't work if the struct is in a kernel-supplied UAPI header
> (which would include the __randomize_layout tag). But if it's *not* in a
> UAPI header it should still be included in a stable form (i.e., without
> the randomize tag) in vmlinux.h, right? Which would be the point:
> consumers would be forced to read it from there and do CO-RE on it...

So you're suggesting something like the following in the uapi header?

#ifndef __KERNEL__
#define __randomize_layout
#endif

?

Let me try to add some padding arguments to xdp_skb_metadata plus the
above to see how it goes.
Toke Høiland-Jørgensen Nov. 15, 2022, 10:31 p.m. UTC | #3
Stanislav Fomichev <sdf@google.com> writes:

> On Tue, Nov 15, 2022 at 7:54 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Stanislav Fomichev <sdf@google.com> writes:
>>
>> > - drop __randomize_layout
>> >
>> >   Not sure it's possible to sanely expose it via UAPI. Because every
>> >   .o potentially gets its own randomized layout, test_progs
>> >   refuses to link.
>>
>> So this won't work if the struct is in a kernel-supplied UAPI header
>> (which would include the __randomize_layout tag). But if it's *not* in a
>> UAPI header it should still be included in a stable form (i.e., without
>> the randomize tag) in vmlinux.h, right? Which would be the point:
>> consumers would be forced to read it from there and do CO-RE on it...
>
> So you're suggesting something like the following in the uapi header?
>
> #ifndef __KERNEL__
> #define __randomize_layout
> #endif
>
> ?

I actually just meant "don't put struct xdp_metadata in an UAPI header
file at all". However, I can see how that complicates having the
skb_metadata pointer in struct xdp_md, so if the above works, that's
fine with me as well :)

> Let me try to add some padding arguments to xdp_skb_metadata plus the
> above to see how it goes.

Cool!

-Toke
Alexei Starovoitov Nov. 15, 2022, 10:54 p.m. UTC | #4
On Tue, Nov 15, 2022 at 10:38 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Tue, Nov 15, 2022 at 7:54 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >
> > Stanislav Fomichev <sdf@google.com> writes:
> >
> > > - drop __randomize_layout
> > >
> > >   Not sure it's possible to sanely expose it via UAPI. Because every
> > >   .o potentially gets its own randomized layout, test_progs
> > >   refuses to link.
> >
> > So this won't work if the struct is in a kernel-supplied UAPI header
> > (which would include the __randomize_layout tag). But if it's *not* in a
> > UAPI header it should still be included in a stable form (i.e., without
> > the randomize tag) in vmlinux.h, right? Which would be the point:
> > consumers would be forced to read it from there and do CO-RE on it...
>
> So you're suggesting something like the following in the uapi header?
>
> #ifndef __KERNEL__
> #define __randomize_layout
> #endif
>

1.
__randomize_layout in uapi header makes no sense.

2.
It's supported by gcc plugin and afaik that plugin is broken
vs debug info, so dwarf is broken, hence BTF is broken too,
and CO-RE doesn't work on kernels compiled with that gcc plugin.
Toke Høiland-Jørgensen Nov. 15, 2022, 11:13 p.m. UTC | #5
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Tue, Nov 15, 2022 at 10:38 AM Stanislav Fomichev <sdf@google.com> wrote:
>>
>> On Tue, Nov 15, 2022 at 7:54 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >
>> > Stanislav Fomichev <sdf@google.com> writes:
>> >
>> > > - drop __randomize_layout
>> > >
>> > >   Not sure it's possible to sanely expose it via UAPI. Because every
>> > >   .o potentially gets its own randomized layout, test_progs
>> > >   refuses to link.
>> >
>> > So this won't work if the struct is in a kernel-supplied UAPI header
>> > (which would include the __randomize_layout tag). But if it's *not* in a
>> > UAPI header it should still be included in a stable form (i.e., without
>> > the randomize tag) in vmlinux.h, right? Which would be the point:
>> > consumers would be forced to read it from there and do CO-RE on it...
>>
>> So you're suggesting something like the following in the uapi header?
>>
>> #ifndef __KERNEL__
>> #define __randomize_layout
>> #endif
>>
>
> 1.
> __randomize_layout in uapi header makes no sense.

I agree, which is why I wanted it to be only in vmlinux.h...

> 2.
> It's supported by gcc plugin and afaik that plugin is broken
> vs debug info, so dwarf is broken, hence BTF is broken too,
> and CO-RE doesn't work on kernels compiled with that gcc plugin.

...however this one seems a deal breaker. Ah well, too bad, seemed like
a neat trick to enforce CO-RE :(

-Toke