diff mbox series

vmlinux.h overlap/conflict with network protocol definitions

Message ID 20240404100901.7d6bc10f@hermes.local (mailing list archive)
State RFC
Delegated to: BPF
Headers show
Series vmlinux.h overlap/conflict with network protocol definitions | expand

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply, async

Commit Message

Stephen Hemminger April 4, 2024, 5:09 p.m. UTC
I am fixing the use of TC BPF in DPDK to use libbpf and bpftool.
Luca recommended using vmlinux.h to address possible build and
CO:RE issues. But it won't work.

There are missing pieces such as definitions of IPV6 next header
fields (in linux/ipv6.h) and TC actions.

Without major hack surgery, not possible to use vmlinux.h instead.
Since vmlinux.h defines may things that overlap with other headers.

Using:

$ /usr/sbin/bpftool -V
bpftool v7.3.0
using libbpf v1.3
features:

$ uname -r
6.6.15-amd64

The change to BPF program to use vmlinux is:


Resulting build failure is:


~/DPDK/tap $ ninja -C build
ninja: Entering directory `build'
[1/33] Generating drivers/net/tap/bpf/tap_rss.bpf.o with a custom command
FAILED: drivers/net/tap/bpf/tap_rss.o 
/usr/bin/clang -O2 -Wall -Wextra -DTAP_MAX_QUEUES=16 -target bpf -g -c -idirafter /usr/include -idirafter /usr/include/x86_64-linux-gnu ../drivers/net/tap/bpf/tap_rss.c -o drivers/net/tap/bpf/tap_rss.o
../drivers/net/tap/bpf/tap_rss.c:116:8: error: use of undeclared identifier 'IPPROTO_HOPOPTS'
                case IPPROTO_HOPOPTS:
                     ^
../drivers/net/tap/bpf/tap_rss.c:117:8: error: use of undeclared identifier 'IPPROTO_ROUTING'
                case IPPROTO_ROUTING:
                     ^
../drivers/net/tap/bpf/tap_rss.c:118:8: error: use of undeclared identifier 'IPPROTO_DSTOPTS'
                case IPPROTO_DSTOPTS:
                     ^
../drivers/net/tap/bpf/tap_rss.c:126:8: error: use of undeclared identifier 'IPPROTO_FRAGMENT'
                case IPPROTO_FRAGMENT:
                     ^
../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
        if (skb->protocol == bpf_htons(ETH_P_IP))
                                       ^
../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
        else if (skb->protocol == bpf_htons(ETH_P_IPV6))
                                            ^
../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
../drivers/net/tap/bpf/tap_rss.c:248:10: error: use of undeclared identifier 'TC_ACT_OK'
                return TC_ACT_OK;
                       ^
../drivers/net/tap/bpf/tap_rss.c:252:10: error: use of undeclared identifier 'TC_ACT_OK'
                return TC_ACT_OK;
                       ^
../drivers/net/tap/bpf/tap_rss.c:256:9: error: use of undeclared identifier 'TC_ACT_PIPE'
        return TC_ACT_PIPE;
               ^
15 errors generated.
ninja: build stopped: subcommand failed.

Comments

Yonghong Song April 4, 2024, 6:16 p.m. UTC | #1
On 4/4/24 10:09 AM, Stephen Hemminger wrote:
> I am fixing the use of TC BPF in DPDK to use libbpf and bpftool.
> Luca recommended using vmlinux.h to address possible build and
> CO:RE issues. But it won't work.
>
> There are missing pieces such as definitions of IPV6 next header
> fields (in linux/ipv6.h) and TC actions.
>
> Without major hack surgery, not possible to use vmlinux.h instead.
> Since vmlinux.h defines may things that overlap with other headers.
>
> Using:
>
> $ /usr/sbin/bpftool -V
> bpftool v7.3.0
> using libbpf v1.3
> features:
>
> $ uname -r
> 6.6.15-amd64
>
> The change to BPF program to use vmlinux is:
>
> diff --git a/drivers/net/tap/bpf/tap_rss.c b/drivers/net/tap/bpf/tap_rss.c
> index 888b3bdc24..79f4ee31a1 100644
> --- a/drivers/net/tap/bpf/tap_rss.c
> +++ b/drivers/net/tap/bpf/tap_rss.c
> @@ -2,12 +2,7 @@
>    * Copyright 2017 Mellanox Technologies, Ltd
>    */
>   
> -#include <linux/in.h>
> -#include <linux/if_ether.h>
> -#include <linux/ip.h>
> -#include <linux/ipv6.h>
> -#include <linux/pkt_cls.h>
> -#include <linux/bpf.h>
> +#include "vmlinux.h"
>   
>   #include <bpf/bpf_helpers.h>
>   #include <bpf/bpf_endian.h>
>
> Resulting build failure is:
>
>
> ~/DPDK/tap $ ninja -C build
> ninja: Entering directory `build'
> [1/33] Generating drivers/net/tap/bpf/tap_rss.bpf.o with a custom command
> FAILED: drivers/net/tap/bpf/tap_rss.o
> /usr/bin/clang -O2 -Wall -Wextra -DTAP_MAX_QUEUES=16 -target bpf -g -c -idirafter /usr/include -idirafter /usr/include/x86_64-linux-gnu ../drivers/net/tap/bpf/tap_rss.c -o drivers/net/tap/bpf/tap_rss.o
> ../drivers/net/tap/bpf/tap_rss.c:116:8: error: use of undeclared identifier 'IPPROTO_HOPOPTS'
>                  case IPPROTO_HOPOPTS:
>                       ^
> ../drivers/net/tap/bpf/tap_rss.c:117:8: error: use of undeclared identifier 'IPPROTO_ROUTING'
>                  case IPPROTO_ROUTING:
>                       ^
> ../drivers/net/tap/bpf/tap_rss.c:118:8: error: use of undeclared identifier 'IPPROTO_DSTOPTS'
>                  case IPPROTO_DSTOPTS:
>                       ^
> ../drivers/net/tap/bpf/tap_rss.c:126:8: error: use of undeclared identifier 'IPPROTO_FRAGMENT'
>                  case IPPROTO_FRAGMENT:
>                       ^
> ../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
>          if (skb->protocol == bpf_htons(ETH_P_IP))
>                                         ^
> ../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
> ../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
> ../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
> ../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
>          else if (skb->protocol == bpf_htons(ETH_P_IPV6))
>                                              ^
> ../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
> ../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
> ../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
> ../drivers/net/tap/bpf/tap_rss.c:248:10: error: use of undeclared identifier 'TC_ACT_OK'
>                  return TC_ACT_OK;
>                         ^
> ../drivers/net/tap/bpf/tap_rss.c:252:10: error: use of undeclared identifier 'TC_ACT_OK'
>                  return TC_ACT_OK;
>                         ^
> ../drivers/net/tap/bpf/tap_rss.c:256:9: error: use of undeclared identifier 'TC_ACT_PIPE'
>          return TC_ACT_PIPE;
>                 ^
> 15 errors generated.
> ninja: build stopped: subcommand failed.

This is a known issue as currently vmlinux.h does not support macros. 
There are some efforts by Edward Zingerman to support this but this has 
not done yet. At the same time, you could have a trivial header file 
like 
https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/bpf_tracing_net.h 
to be used for bpf program and then your bpf program with vmlinux.h can 
have much easier CORE support.
Stephen Hemminger April 4, 2024, 6:27 p.m. UTC | #2
On Thu, 4 Apr 2024 11:16:32 -0700
Yonghong Song <yonghong.song@linux.dev> wrote:

> On 4/4/24 10:09 AM, Stephen Hemminger wrote:
> > I am fixing the use of TC BPF in DPDK to use libbpf and bpftool.
> > Luca recommended using vmlinux.h to address possible build and
> > CO:RE issues. But it won't work.
> >
> > There are missing pieces such as definitions of IPV6 next header
> > fields (in linux/ipv6.h) and TC actions.
> >
> > Without major hack surgery, not possible to use vmlinux.h instead.
> > Since vmlinux.h defines may things that overlap with other headers.
> >
> > Using:
> >
> > $ /usr/sbin/bpftool -V
> > bpftool v7.3.0
> > using libbpf v1.3
> > features:
> >
> > $ uname -r
> > 6.6.15-amd64
> >
> > The change to BPF program to use vmlinux is:
> >
> > diff --git a/drivers/net/tap/bpf/tap_rss.c b/drivers/net/tap/bpf/tap_rss.c
> > index 888b3bdc24..79f4ee31a1 100644
> > --- a/drivers/net/tap/bpf/tap_rss.c
> > +++ b/drivers/net/tap/bpf/tap_rss.c
> > @@ -2,12 +2,7 @@
> >    * Copyright 2017 Mellanox Technologies, Ltd
> >    */
> >   
> > -#include <linux/in.h>
> > -#include <linux/if_ether.h>
> > -#include <linux/ip.h>
> > -#include <linux/ipv6.h>
> > -#include <linux/pkt_cls.h>
> > -#include <linux/bpf.h>
> > +#include "vmlinux.h"
> >   
> >   #include <bpf/bpf_helpers.h>
> >   #include <bpf/bpf_endian.h>
> >
> > Resulting build failure is:
> >
> >
> > ~/DPDK/tap $ ninja -C build
> > ninja: Entering directory `build'
> > [1/33] Generating drivers/net/tap/bpf/tap_rss.bpf.o with a custom command
> > FAILED: drivers/net/tap/bpf/tap_rss.o
> > /usr/bin/clang -O2 -Wall -Wextra -DTAP_MAX_QUEUES=16 -target bpf -g -c -idirafter /usr/include -idirafter /usr/include/x86_64-linux-gnu ../drivers/net/tap/bpf/tap_rss.c -o drivers/net/tap/bpf/tap_rss.o
> > ../drivers/net/tap/bpf/tap_rss.c:116:8: error: use of undeclared identifier 'IPPROTO_HOPOPTS'
> >                  case IPPROTO_HOPOPTS:
> >                       ^
> > ../drivers/net/tap/bpf/tap_rss.c:117:8: error: use of undeclared identifier 'IPPROTO_ROUTING'
> >                  case IPPROTO_ROUTING:
> >                       ^
> > ../drivers/net/tap/bpf/tap_rss.c:118:8: error: use of undeclared identifier 'IPPROTO_DSTOPTS'
> >                  case IPPROTO_DSTOPTS:
> >                       ^
> > ../drivers/net/tap/bpf/tap_rss.c:126:8: error: use of undeclared identifier 'IPPROTO_FRAGMENT'
> >                  case IPPROTO_FRAGMENT:
> >                       ^
> > ../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
> >          if (skb->protocol == bpf_htons(ETH_P_IP))
> >                                         ^
> > ../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
> > ../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
> > ../drivers/net/tap/bpf/tap_rss.c:210:33: error: use of undeclared identifier 'ETH_P_IP'
> > ../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
> >          else if (skb->protocol == bpf_htons(ETH_P_IPV6))
> >                                              ^
> > ../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
> > ../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
> > ../drivers/net/tap/bpf/tap_rss.c:212:38: error: use of undeclared identifier 'ETH_P_IPV6'
> > ../drivers/net/tap/bpf/tap_rss.c:248:10: error: use of undeclared identifier 'TC_ACT_OK'
> >                  return TC_ACT_OK;
> >                         ^
> > ../drivers/net/tap/bpf/tap_rss.c:252:10: error: use of undeclared identifier 'TC_ACT_OK'
> >                  return TC_ACT_OK;
> >                         ^
> > ../drivers/net/tap/bpf/tap_rss.c:256:9: error: use of undeclared identifier 'TC_ACT_PIPE'
> >          return TC_ACT_PIPE;
> >                 ^
> > 15 errors generated.
> > ninja: build stopped: subcommand failed.  
> 
> This is a known issue as currently vmlinux.h does not support macros. 
> There are some efforts by Edward Zingerman to support this but this has 
> not done yet. At the same time, you could have a trivial header file 
> like 
> https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/bpf_tracing_net.h 
> to be used for bpf program and then your bpf program with vmlinux.h can 
> have much easier CORE support.


That is an example of header surgery which I would rather avoid having to carry
as long term technical debt baggage.
Andrii Nakryiko April 4, 2024, 9:45 p.m. UTC | #3
On Thu, Apr 4, 2024 at 11:27 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Thu, 4 Apr 2024 11:16:32 -0700
> Yonghong Song <yonghong.song@linux.dev> wrote:
>
> > On 4/4/24 10:09 AM, Stephen Hemminger wrote:
> > > I am fixing the use of TC BPF in DPDK to use libbpf and bpftool.
> > > Luca recommended using vmlinux.h to address possible build and
> > > CO:RE issues. But it won't work.
> > >
> > > There are missing pieces such as definitions of IPV6 next header
> > > fields (in linux/ipv6.h) and TC actions.
> > >
> > > Without major hack surgery, not possible to use vmlinux.h instead.
> > > Since vmlinux.h defines may things that overlap with other headers.
> > >
> > > Using:

[...]

> > This is a known issue as currently vmlinux.h does not support macros.
> > There are some efforts by Edward Zingerman to support this but this has
> > not done yet. At the same time, you could have a trivial header file
> > like
> > https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
> > to be used for bpf program and then your bpf program with vmlinux.h can
> > have much easier CORE support.
>
>
> That is an example of header surgery which I would rather avoid having to carry
> as long term technical debt baggage.
>

What's your ultimate goal? As Yonghong said, vmlinux.h is not
compatible with other headers. So you have to pick either using
vmlinux.h as a base + adding missing #define's (because those are not
recorded in types, so can't be put into vmlinux.h), or not use
vmlinux.h, use linux UAPI/internal headers and then use explicit CO-RE
helpers/attributes to make your application CO-RE-relocatable.

It's not clear from your original email why exactly you wanted to
switch to vmlinux.h in the first place.
Stephen Hemminger April 5, 2024, 12:52 a.m. UTC | #4
On Thu, 4 Apr 2024 14:45:04 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> > > This is a known issue as currently vmlinux.h does not support macros.
> > > There are some efforts by Edward Zingerman to support this but this has
> > > not done yet. At the same time, you could have a trivial header file
> > > like
> > > https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
> > > to be used for bpf program and then your bpf program with vmlinux.h can
> > > have much easier CORE support.  
> >
> >
> > That is an example of header surgery which I would rather avoid having to carry
> > as long term technical debt baggage.
> >  
> 
> What's your ultimate goal? As Yonghong said, vmlinux.h is not
> compatible with other headers. So you have to pick either using
> vmlinux.h as a base + adding missing #define's (because those are not
> recorded in types, so can't be put into vmlinux.h), or not use
> vmlinux.h, use linux UAPI/internal headers and then use explicit CO-RE
> helpers/attributes to make your application CO-RE-relocatable.
> 
> It's not clear from your original email why exactly you wanted to
> switch to vmlinux.h in the first place.

Some backstory. There is not an existing TC filter for this, so the
original developer had the idea of using BPF to do it. 

The program is a small BPF program to implement a TC filter that looks at 
SKB and does mapping to queue based on L3 (or L3/L4) header. So not heavily dependent
on kernel data structure, but sk_buff is not necessarily stable; actual layout
depends on kernel config.

But the evolution of BPF has made the old code unusable. I am trying to get it working again,
and cleanup to modern BPF by using BPF skeleton code.

Luca was one who had the suggestion about vmlinux.

> Using bpftool to generate the header at build time is a bit icky,
> because it will look at sysfs on the build system, which is from the
> running kernel. But a build system's kernel might be some ancient LTS,
> and even be a completely different kconfig/build/distro from the actual
> runtime one.
> 
> We have ran in the same problem in systemd recently, and the solution
> is to have distros publish the vmlinux.h together with the kernel
> image/headers, that way we can rely on the fact that by build-depending
> on the right kernel package we get exactly the generated vmlinux.h that
> we want. This has already happened in Centos, Debian, Fedora and Arch,
> and I am trying to get Ubuntu onboard too.
> 
> The annoying thing is that every distro packages differently, so the
> path needs to be configurable with a meson option.
> 
> Feel free to pilfer the systemd meson glue:
> 
> https://github.com/systemd/systemd/pull/26826/commits/d917079e7e320aa281fc4ad6f8073b0814b9cb13
> 
> It's of course file to go to the runtime kernel if no vmlinux.h is
> specified, as a fallback, which is going to be useful for developers
> machines.
> 
> -- 
> Kind regards,
> Luca Boccassi
Andrii Nakryiko April 5, 2024, 5:31 p.m. UTC | #5
On Thu, Apr 4, 2024 at 5:53 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Thu, 4 Apr 2024 14:45:04 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > > > This is a known issue as currently vmlinux.h does not support macros.
> > > > There are some efforts by Edward Zingerman to support this but this has
> > > > not done yet. At the same time, you could have a trivial header file
> > > > like
> > > > https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
> > > > to be used for bpf program and then your bpf program with vmlinux.h can
> > > > have much easier CORE support.
> > >
> > >
> > > That is an example of header surgery which I would rather avoid having to carry
> > > as long term technical debt baggage.
> > >
> >
> > What's your ultimate goal? As Yonghong said, vmlinux.h is not
> > compatible with other headers. So you have to pick either using
> > vmlinux.h as a base + adding missing #define's (because those are not
> > recorded in types, so can't be put into vmlinux.h), or not use
> > vmlinux.h, use linux UAPI/internal headers and then use explicit CO-RE
> > helpers/attributes to make your application CO-RE-relocatable.
> >
> > It's not clear from your original email why exactly you wanted to
> > switch to vmlinux.h in the first place.
>
> Some backstory. There is not an existing TC filter for this, so the
> original developer had the idea of using BPF to do it.
>
> The program is a small BPF program to implement a TC filter that looks at
> SKB and does mapping to queue based on L3 (or L3/L4) header. So not heavily dependent
> on kernel data structure, but sk_buff is not necessarily stable; actual layout
> depends on kernel config.
>

If it's only a few fields from sk_buff that you need, you can define
your own minimal sk_buff definition with
__attribute__((preserve_access_index)) added to it. You don't have to
declare fields in the right order, just make sure that field types
match. E.g., something like:

struct sk_buff {
  unsigned char *head;
  unsigned char *data;
  struct sock *sock;
} __attribute__((preserve_access_index));

You can call it `struct sk_buff___mine` if you already include
sk_buff, to avoid the conflict. Libbpf will still understand that it
should match it to struct sk_buff in the kernel.

Alternatively, you can just use BPF_CORE_READ() macro on types you get
from kernel headers, even if they don't have that
preserve_access_index attribute.

Or, you can just use __builtin_preserve_access_index(&skb->len) to
access sk_buff's fields in CO-RE-relocatable way.

Or, you can have entire block within which all fields accesses will be
CO-RE-relocatable:

int len;

__builtin_preserve_access_index(({
    len = skb->len;
    /* other skb accesses here as well */
}));

Many ways to have bolted on CO-RE even without vmlinux.h. vmlinux.h is
convenient (apart from lack of #defines, but that's an orthogonal
problem), but by no means required for BPF CO-RE.


> But the evolution of BPF has made the old code unusable. I am trying to get it working again,
> and cleanup to modern BPF by using BPF skeleton code.
>

That's a good idea independently from vmlinux.h, yep.

> Luca was one who had the suggestion about vmlinux.
>
> > Using bpftool to generate the header at build time is a bit icky,
> > because it will look at sysfs on the build system, which is from the
> > running kernel. But a build system's kernel might be some ancient LTS,
> > and even be a completely different kconfig/build/distro from the actual
> > runtime one.
> >
> > We have ran in the same problem in systemd recently, and the solution
> > is to have distros publish the vmlinux.h together with the kernel
> > image/headers, that way we can rely on the fact that by build-depending
> > on the right kernel package we get exactly the generated vmlinux.h that
> > we want. This has already happened in Centos, Debian, Fedora and Arch,
> > and I am trying to get Ubuntu onboard too.
> >
> > The annoying thing is that every distro packages differently, so the
> > path needs to be configurable with a meson option.
> >
> > Feel free to pilfer the systemd meson glue:
> >
> > https://github.com/systemd/systemd/pull/26826/commits/d917079e7e320aa281fc4ad6f8073b0814b9cb13
> >
> > It's of course file to go to the runtime kernel if no vmlinux.h is
> > specified, as a fallback, which is going to be useful for developers
> > machines.
> >
> > --
> > Kind regards,
> > Luca Boccassi
>
>
>
Stephen Hemminger April 5, 2024, 7:37 p.m. UTC | #6
On Fri, 5 Apr 2024 10:31:46 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Thu, Apr 4, 2024 at 5:53 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Thu, 4 Apr 2024 14:45:04 -0700
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >  
> > > > > This is a known issue as currently vmlinux.h does not support macros.
> > > > > There are some efforts by Edward Zingerman to support this but this has
> > > > > not done yet. At the same time, you could have a trivial header file
> > > > > like
> > > > > https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
> > > > > to be used for bpf program and then your bpf program with vmlinux.h can
> > > > > have much easier CORE support.  
> > > >
> > > >
> > > > That is an example of header surgery which I would rather avoid having to carry
> > > > as long term technical debt baggage.
> > > >  
> > >
> > > What's your ultimate goal? As Yonghong said, vmlinux.h is not
> > > compatible with other headers. So you have to pick either using
> > > vmlinux.h as a base + adding missing #define's (because those are not
> > > recorded in types, so can't be put into vmlinux.h), or not use
> > > vmlinux.h, use linux UAPI/internal headers and then use explicit CO-RE
> > > helpers/attributes to make your application CO-RE-relocatable.
> > >
> > > It's not clear from your original email why exactly you wanted to
> > > switch to vmlinux.h in the first place.  
> >
> > Some backstory. There is not an existing TC filter for this, so the
> > original developer had the idea of using BPF to do it.
> >
> > The program is a small BPF program to implement a TC filter that looks at
> > SKB and does mapping to queue based on L3 (or L3/L4) header. So not heavily dependent
> > on kernel data structure, but sk_buff is not necessarily stable; actual layout
> > depends on kernel config.
> >  
> 
> If it's only a few fields from sk_buff that you need, you can define
> your own minimal sk_buff definition with
> __attribute__((preserve_access_index)) added to it. You don't have to
> declare fields in the right order, just make sure that field types
> match. E.g., something like:
> 
> struct sk_buff {
>   unsigned char *head;
>   unsigned char *data;
>   struct sock *sock;
> } __attribute__((preserve_access_index));
> 
> You can call it `struct sk_buff___mine` if you already include
> sk_buff, to avoid the conflict. Libbpf will still understand that it
> should match it to struct sk_buff in the kernel.
> 
> Alternatively, you can just use BPF_CORE_READ() macro on types you get
> from kernel headers, even if they don't have that
> preserve_access_index attribute.
> 
> Or, you can just use __builtin_preserve_access_index(&skb->len) to
> access sk_buff's fields in CO-RE-relocatable way.
> 
> Or, you can have entire block within which all fields accesses will be
> CO-RE-relocatable:
> 
> int len;
> 
> __builtin_preserve_access_index(({
>     len = skb->len;
>     /* other skb accesses here as well */
> }));
> 
> Many ways to have bolted on CO-RE even without vmlinux.h. vmlinux.h is
> convenient (apart from lack of #defines, but that's an orthogonal
> problem), but by no means required for BPF CO-RE.
> 


The skbuff pointer is passed to the TC program. So BPF_CORE_READ doesn't
make sense here.  The biggest concern is that kernel config of the build
machine may not match the kernel config of the eventual target, and some
of the fields of the sk_buff are hidden based on #ifdef and may change.
Andrii Nakryiko April 5, 2024, 8 p.m. UTC | #7
On Fri, Apr 5, 2024 at 12:38 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Fri, 5 Apr 2024 10:31:46 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Thu, Apr 4, 2024 at 5:53 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Thu, 4 Apr 2024 14:45:04 -0700
> > > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > > > > This is a known issue as currently vmlinux.h does not support macros.
> > > > > > There are some efforts by Edward Zingerman to support this but this has
> > > > > > not done yet. At the same time, you could have a trivial header file
> > > > > > like
> > > > > > https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
> > > > > > to be used for bpf program and then your bpf program with vmlinux.h can
> > > > > > have much easier CORE support.
> > > > >
> > > > >
> > > > > That is an example of header surgery which I would rather avoid having to carry
> > > > > as long term technical debt baggage.
> > > > >
> > > >
> > > > What's your ultimate goal? As Yonghong said, vmlinux.h is not
> > > > compatible with other headers. So you have to pick either using
> > > > vmlinux.h as a base + adding missing #define's (because those are not
> > > > recorded in types, so can't be put into vmlinux.h), or not use
> > > > vmlinux.h, use linux UAPI/internal headers and then use explicit CO-RE
> > > > helpers/attributes to make your application CO-RE-relocatable.
> > > >
> > > > It's not clear from your original email why exactly you wanted to
> > > > switch to vmlinux.h in the first place.
> > >
> > > Some backstory. There is not an existing TC filter for this, so the
> > > original developer had the idea of using BPF to do it.
> > >
> > > The program is a small BPF program to implement a TC filter that looks at
> > > SKB and does mapping to queue based on L3 (or L3/L4) header. So not heavily dependent
> > > on kernel data structure, but sk_buff is not necessarily stable; actual layout
> > > depends on kernel config.
> > >
> >
> > If it's only a few fields from sk_buff that you need, you can define
> > your own minimal sk_buff definition with
> > __attribute__((preserve_access_index)) added to it. You don't have to
> > declare fields in the right order, just make sure that field types
> > match. E.g., something like:
> >
> > struct sk_buff {
> >   unsigned char *head;
> >   unsigned char *data;
> >   struct sock *sock;
> > } __attribute__((preserve_access_index));
> >
> > You can call it `struct sk_buff___mine` if you already include
> > sk_buff, to avoid the conflict. Libbpf will still understand that it
> > should match it to struct sk_buff in the kernel.
> >
> > Alternatively, you can just use BPF_CORE_READ() macro on types you get
> > from kernel headers, even if they don't have that
> > preserve_access_index attribute.
> >
> > Or, you can just use __builtin_preserve_access_index(&skb->len) to
> > access sk_buff's fields in CO-RE-relocatable way.
> >
> > Or, you can have entire block within which all fields accesses will be
> > CO-RE-relocatable:
> >
> > int len;
> >
> > __builtin_preserve_access_index(({
> >     len = skb->len;
> >     /* other skb accesses here as well */
> > }));
> >
> > Many ways to have bolted on CO-RE even without vmlinux.h. vmlinux.h is
> > convenient (apart from lack of #defines, but that's an orthogonal
> > problem), but by no means required for BPF CO-RE.
> >
>
>
> The skbuff pointer is passed to the TC program. So BPF_CORE_READ doesn't
> make sense here.  The biggest concern is that kernel config of the build
> machine may not match the kernel config of the eventual target, and some
> of the fields of the sk_buff are hidden based on #ifdef and may change.

And everything else I wrote?..

Regardless, I think at this point if you want to get help, you
probably should provide the source code.
diff mbox series

Patch

diff --git a/drivers/net/tap/bpf/tap_rss.c b/drivers/net/tap/bpf/tap_rss.c
index 888b3bdc24..79f4ee31a1 100644
--- a/drivers/net/tap/bpf/tap_rss.c
+++ b/drivers/net/tap/bpf/tap_rss.c
@@ -2,12 +2,7 @@ 
  * Copyright 2017 Mellanox Technologies, Ltd
  */
 
-#include <linux/in.h>
-#include <linux/if_ether.h>
-#include <linux/ip.h>
-#include <linux/ipv6.h>
-#include <linux/pkt_cls.h>
-#include <linux/bpf.h>
+#include "vmlinux.h"
 
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_endian.h>