[bpf-next,5/5] libbpf: add selftests for TC-BPF API

Message ID 20210325120020.236504-6-memxor@gmail.com
State New
Series libbpf: Add TC-BPF API

Commit Message

Kumar Kartikeya Dwivedi March 25, 2021, noon UTC
This adds some basic tests for the low level bpf_tc_* API and its
bpf_program__attach_tc_* wrapper on top.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../selftests/bpf/prog_tests/test_tc_bpf.c    | 261 ++++++++++++++++++
 .../selftests/bpf/progs/test_tc_bpf_kern.c    |  18 ++
 2 files changed, 279 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_tc_bpf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_tc_bpf_kern.c

Comments

Alexei Starovoitov March 27, 2021, 2:15 a.m. UTC | #1
On Thu, Mar 25, 2021 at 05:30:03PM +0530, Kumar Kartikeya Dwivedi wrote:
> This adds some basic tests for the low level bpf_tc_* API and its
> bpf_program__attach_tc_* wrapper on top.

*_block() apis from patch 3 and 4 are not covered by this selftest.
Why were they added ? And how were they tested?

Pls trim your cc. bpf@vger and netdev@vger would have been enough.

My main concern with this set is that it adds netlink apis to libbpf while
we already agreed to split xdp manipulation pieces out of libbpf.
It would be odd to add tc apis now only to split them later.
I think it's better to start with new library for tc/xdp and have
libbpf as a dependency on that new lib.
For example we can add it as subdir in tools/lib/bpf/.

Similarly I think integrating static linking into libbpf was a mistake.
It should be a sub library as well.

If we end up with a core libbpf and ten sublibs for tc, xdp, af_xdp, linking,
whatever else, the users would appreciate that we don't shove a single libbpf
at them with a ton of features that they might never use.
Toke Høiland-Jørgensen March 27, 2021, 3:17 p.m. UTC | #2
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Thu, Mar 25, 2021 at 05:30:03PM +0530, Kumar Kartikeya Dwivedi wrote:
>> This adds some basic tests for the low level bpf_tc_* API and its
>> bpf_program__attach_tc_* wrapper on top.
>
> *_block() apis from patch 3 and 4 are not covered by this selftest.
> Why were they added ? And how were they tested?
>
> Pls trim your cc. bpf@vger and netdev@vger would have been enough.
>
> My main concern with this set is that it adds netlink apis to libbpf while
> we already agreed to split xdp manipulation pieces out of libbpf.
> It would be odd to add tc apis now only to split them later.

We're not removing the ability to attach an XDP program via netlink from
libxdp, though. This is the equivalent for TC: the minimum support to
attach a program, and if you want to do more, you pull in another
library or roll your own.

I'm fine with cutting out more stuff and making this even more minimal
(e.g., remove the block stuff and only support attach/detach on ifaces),
but we figured we'd err on the side of including too much and getting
some feedback from others on which bits are the essential ones to keep,
and which can be dropped.

> I think it's better to start with new library for tc/xdp and have
> libbpf as a dependency on that new lib.
> For example we can add it as subdir in tools/lib/bpf/.

I agree for the higher-level stuff (though I'm not sure what that would
be for TC), but right now TC programs are the only ones that cannot be
attached by libbpf, which is annoying; that's what we're trying to fix.

-Toke
Andrii Nakryiko March 28, 2021, 4:32 a.m. UTC | #3
On Fri, Mar 26, 2021 at 7:15 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Mar 25, 2021 at 05:30:03PM +0530, Kumar Kartikeya Dwivedi wrote:
> > This adds some basic tests for the low level bpf_tc_* API and its
> > bpf_program__attach_tc_* wrapper on top.
>
> *_block() apis from patch 3 and 4 are not covered by this selftest.
> Why were they added ? And how were they tested?
>
> Pls trim your cc. bpf@vger and netdev@vger would have been enough.
>
> My main concern with this set is that it adds netlink apis to libbpf while
> we already agreed to split xdp manipulation pieces out of libbpf.
> It would be odd to add tc apis now only to split them later.

We weren't going to split out basic attach APIs at all. So
bpf_set_link_xdp_fd() and bpf_program__attach_xdp() would stay in
libbpf. libxdp/libxsk would contain higher-level APIs which establish
additional conventions, beyond the basic operation of attaching a BPF
program to the XDP hook. E.g., all the chaining and how individual XDP
"sub-programs" are ordered, processed, updated/replaced, etc. That's
all based on one particular convention that libxdp would establish, so
that part shouldn't live in libbpf.

So in that sense, having TC attach APIs makes sense to complete
libbpf's APIs. I think it's totally in libbpf's domain to provide APIs
of the form "attach BPF program to BPF hook".

> I think it's better to start with new library for tc/xdp and have
> libbpf as a dependency on that new lib.
> For example we can add it as subdir in tools/lib/bpf/.
>
> Similarly I think integrating static linking into libbpf was a mistake.
> It should be a sub library as well.
>
> If we end up with a core libbpf and ten sublibs for tc, xdp, af_xdp, linking,
> whatever else, the users would appreciate that we don't shove a single libbpf
> at them with a ton of features that they might never use.

What's the concern exactly? The size of the library? Having 10
micro-libraries has its own set of downsides, I'm not convinced that's
a better situation for end users. And would certainly cause more
hassle for libbpf developers and packagers.

And what did you include in "core libbpf"?
Alexei Starovoitov March 29, 2021, 1:26 a.m. UTC | #4
On Sat, Mar 27, 2021 at 04:17:16PM +0100, Toke Høiland-Jørgensen wrote:
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
> 
> > On Thu, Mar 25, 2021 at 05:30:03PM +0530, Kumar Kartikeya Dwivedi wrote:
> >> This adds some basic tests for the low level bpf_tc_* API and its
> >> bpf_program__attach_tc_* wrapper on top.
> >
> > *_block() apis from patch 3 and 4 are not covered by this selftest.
> > Why were they added ? And how were they tested?
> >
> > Pls trim your cc. bpf@vger and netdev@vger would have been enough.
> >
> > My main concern with this set is that it adds netlink apis to libbpf while
> > we already agreed to split xdp manipulation pieces out of libbpf.
> > It would be odd to add tc apis now only to split them later.
> 
> We're not removing the ability to attach an XDP program via netlink from
> libxdp, though. This is the equivalent for TC: the minimum support to
> attach a program, and if you want to do more, you pull in another
> library or roll your own.
> 
> I'm fine with cutting out more stuff and making this even more minimal
> (e.g., remove the block stuff and only support attach/detach on ifaces),
> but we figured we'd err on the side of including too much and getting
> some feedback from others on which bits are the essential ones to keep,
> and which can be dropped.

This is up to you. I'm trying to understand the motivation for *_block() apis.
I'm not taking a stance for/against them.

> > I think it's better to start with new library for tc/xdp and have
> > libbpf as a dependency on that new lib.
> > For example we can add it as subdir in tools/lib/bpf/.
> 
> I agree for the higher-level stuff (though I'm not sure what that would
> be for TC), but right now TC programs are the only ones that cannot be
> attached by libbpf, which is annoying; that's what we're trying to fix.

Sure. I wasn't saying that there is no place for these APIs in libbpf+.
Just that the existing libbpf has already become a kitchen sink of features
that users are not going to use, like static linking.
The tc API was the straw that broke the camel's back.
I think we must move static linking and skeleton out of libbpf before
the next release.
Alexei Starovoitov March 29, 2021, 1:40 a.m. UTC | #5
On Sat, Mar 27, 2021 at 09:32:58PM -0700, Andrii Nakryiko wrote:
> > I think it's better to start with new library for tc/xdp and have
> > libbpf as a dependency on that new lib.
> > For example we can add it as subdir in tools/lib/bpf/.
> >
> > Similarly I think integrating static linking into libbpf was a mistake.
> > It should be a sub library as well.
> >
> > If we end up with a core libbpf and ten sublibs for tc, xdp, af_xdp, linking,
> > whatever else, the users would appreciate that we don't shove a single libbpf
> > at them with a ton of features that they might never use.
> 
> What's the concern exactly? The size of the library? Having 10
> micro-libraries has its own set of downsides, 

specifically?

> I'm not convinced that's
> a better situation for end users. And would certainly cause more
> hassle for libbpf developers and packagers.

For developers and packagers.. yes.
For users.. quite the opposite.
The skel gen and static linking must be split out before the next libbpf release.
Not a single application linked with libbpf is going to use those pieces.
bpftool is the one and only tool that needs them. Hence forcing libbpf users
to increase their .text with dead code is a selfish call by libbpf
developers and packagers. The users' priorities must come first.

> And what did you include in "core libbpf"?

I would take this opportunity to split libbpf into maintainable pieces:
- libsysbpf - sys_bpf wrappers (pretty much tools/lib/bpf/bpf.c)
- libbpfutil - hash, strset
- libbtf - BTF read/write
- libbpfelf - ELF parsing, CORE, ksym, kconfig
- libbpfskel - skeleton gen used by bpftool only
- libbpflink - linker used by bpftool only
- libbpfnet - networking attachment via netlink including TC and XDP
- libbpftrace - perfbuf, ringbuf
- libxdp - Toke's xdp chaining
- libxsk - af_xdp logic

In the future the stack trace symbolization code can come
into libbpftrace or be a part of its own lib.
My upcoming loader program and signed prog generation logic
can be part of libbpfskel.
Kumar Kartikeya Dwivedi March 29, 2021, 1:45 a.m. UTC | #6
On Mon, Mar 29, 2021 at 06:56:02AM IST, Alexei Starovoitov wrote:
> This is up to you. I'm trying to understand the motivation for *_block() apis.
> I'm not taking a stance for/against them.

The block APIs simply attach to a shared filter block instead of a device, so
in that sense they just forward to the bpf_tc_cls_*_dev API internally, where
parent_id is substituted with the block_index and ifindex is set to a special
value (to indicate operation on a block). It is still a distinct attach point,
though, and the two APIs cannot be mixed (i.e. a filter attached using the
block API cannot be manipulated using the dev API).

e.g.

# tc qdisc add dev <foo> ingress block 1
# tc qdisc add dev <bar> ingress block 1

Now you can attach a filter to the shared block, e.g.

# tc filter add block 1 bpf /home/kkd/foo.o sec cls direct-action

and it will attach the identical filter with the bpf prog classifier to both
qdiscs in one go, instead of having to duplicate filter creation for each qdisc.
You can add arbitrarily many qdiscs to such a filter block, easing filter
management, and saving on resources.

So for the API, it made sense to separate this into its own functions, as it is
a different attach point, both for the low-level API and the higher-level
wrappers. This does increase the symbol count, but maintenance-wise it is
zero-cost since it simply forwards to the dev functions.
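To make the comparison concrete, an illustrative sketch: bpf_tc_cls_attach_dev()
and its arguments are taken from the selftest in this patch, while the name and
exact signature of the block variant are my assumption based on the description
above (the shared block index takes the place of ifindex/parent_id).

static int attach_dev_and_block(int prog_fd)
{
	struct bpf_tc_cls_attach_id id = {};
	DECLARE_LIBBPF_OPTS(bpf_tc_cls_opts, opts, .handle = 1, .priority = 10);
	int ret;

	/* attach to a single device's clsact ingress hook */
	ret = bpf_tc_cls_attach_dev(prog_fd, 2 /* ifindex */,
				    BPF_TC_CLSACT_INGRESS, ETH_P_ALL,
				    &opts, &id);
	if (ret < 0)
		return ret;

	/* attach once to shared block 1, covering every qdisc bound to it */
	return bpf_tc_cls_attach_block(prog_fd, 1 /* block_index */,
				       ETH_P_ALL, &opts, &id);
}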

As for the tests, I'll add them for the block API in v2, when I get around to
sending it (i.e. after the review is over).

> [...]

--
Kartikeya
Andrii Nakryiko March 29, 2021, 2:38 a.m. UTC | #7
On Sun, Mar 28, 2021 at 6:40 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sat, Mar 27, 2021 at 09:32:58PM -0700, Andrii Nakryiko wrote:
> > > I think it's better to start with new library for tc/xdp and have
> > > libbpf as a dependency on that new lib.
> > > For example we can add it as subdir in tools/lib/bpf/.
> > >
> > > Similarly I think integrating static linking into libbpf was a mistake.
> > > It should be a sub library as well.
> > >
> > > If we end up with a core libbpf and ten sublibs for tc, xdp, af_xdp, linking,
> > > whatever else, the users would appreciate that we don't shove a single libbpf
> > > at them with a ton of features that they might never use.
> >
> > What's the concern exactly? The size of the library? Having 10
> > micro-libraries has its own set of downsides,
>
> specifically?

You didn't answer my question, but from what you write below I assume
libbpf size is your main concern?

As for downsides, I'm sure I'm not yet seeing all of the problems
we'll encounter when splitting libbpf into 10 pieces. But as a user,
having to figure out which libraries I need to use is a big hassle.
E.g., for XDP application using ringbuf, I'll need libbpfelf,
libbpftrace, libbpfnet, which implicitly also would depend on
libsysbpf, libbtf, libbpfutil, I assume. So having to list 3 vs 1
library is already annoying, but when statically linking I'd need to
specify all 6. I'd very much rather know that it has to be -lbpf and it
will provide me with all the basics (and it's already -lz and -lelf in
the static linking scenario, which I wish we could get rid of).

>
> > I'm not convinced that's
> > a better situation for end users. And would certainly cause more
> > hassle for libbpf developers and packagers.
>
> For developers and packagers.. yes.
> For users.. quite the opposite.

See above. I don't know what hassle libbpf is for users today. You
were implying code size used for functionality users might not use
(e.g., linker). Libbpf is a very small library, <300KB. There are
users building tools for constrained embedded systems that use libbpf.
There are/were various problems mentioned, but the size of libbpf
wasn't yet one of them. We should certainly watch the code bloat, but
we are not yet at the point where the library is too big for users to be
turned off. In the shared library case it's even less of a concern.

> The skel gen and static linking must be split out before the next libbpf release.
> Not a single application linked with libbpf is going to use those pieces.
> bpftool is the one and only tool that needs them. Hence forcing libbpf users
> to increase their .text with dead code is a selfish call by libbpf
> developers and packagers. The users' priorities must come first.
>
> > And what did you include in "core libbpf"?
>
> I would take this opportunity to split libbpf into maintainable pieces:
> - libsysbpf - sys_bpf wrappers (pretty much tools/lib/bpf/bpf.c)
> - libbpfutil - hash, strset

strset and hash are internal data structures, I never intended to
expose them through public APIs. I haven't investigated, but if we
have a separate shared library (libbpfutil), I imagine we won't be
able to hide those APIs, right?

> - libbtf - BTF read/write
> - libbpfelf - ELF parsing, CORE, ksym, kconfig
> - libbpfskel - skeleton gen used by bpftool only

Skeleton generation is already part of bpftool; there is no need to
split anything out.

> - libbpflink - linker used by bpftool only
> - libbpfnet - networking attachment via netlink including TC and XDP
> - libbpftrace - perfbuf, ringbuf

ringbuf and perfbuf are both very small code-wise, and are used in
the majority of BPF applications anyway.

> - libxdp - Toke's xdp chaining
> - libxsk - af_xdp logic
>

Now, if we look at libbpf .o files, we can approximately see what
functionality is using most code:

File                Size Percent

bpf.o              17800    4.88
bpf_prog_linfo.o    2952    0.81
btf_dump.o         20472    5.61
btf.o              58160   15.93
hashmap.o           4056    1.11
libbpf_errno.o      2912    0.80
libbpf.o          190072   52.06
libbpf_probes.o     6696    1.83
linker.o           29408    8.05
netlink.o           5944    1.63
nlattr.o            2744    0.75
ringbuf.o           6128    1.68
str_error.o         1640    0.45
strset.o            3656    1.00
xsk.o              12456    3.41

Total             365096  100.00

so libbpf.o, which has mostly bpf_object open/load logic and CO-RE, takes
more than half already. And it depends on still more stuff in btf,
hashmap, bpf, libbpf_probes, errno. But the final code size is even
smaller, because libbpf.so is just 285128 bytes (not 365096 as implied
by the table above), so even these numbers are pessimistic.

linker.o is about 8% of the code right now, but actually takes less
than 29KB: when I remove linker.o and re-compile, the final libbpf.so
goes from 285128 to 267576 bytes, a 17552-byte reduction. Even if it
grows 2x, I'd still say it's not a big deal.

One reason to keep BPF linker in libbpf is that it is not only bpftool
that would be using it. Our libbpf Rust bindings (libbpf-rs)
are implementing their own BPF skeleton generation, and we'd like to use
the linker APIs to support static linking when using libbpf-rs without
depending on bpftool. So having it in libbpf and not in bpftool is
good when you consider the wider ecosystem.

But again, let's just reflect for a second that we are talking about
the library that takes less than 300KB total. It would also be
interesting to experiment with LTO and its effect on final binaries
when statically linking against libbpf. I haven't tried yet, though.


> In the future the stack trace symbolization code can come
> into libbpftrace or be a part of its own lib.
> My upcoming loader program and signed prog generation logic
> can be part of libbpfskel.
Toke Høiland-Jørgensen March 29, 2021, 9:56 a.m. UTC | #8
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Sat, Mar 27, 2021 at 09:32:58PM -0700, Andrii Nakryiko wrote:
>> > I think it's better to start with new library for tc/xdp and have
>> > libbpf as a dependency on that new lib.
>> > For example we can add it as subdir in tools/lib/bpf/.
>> >
>> > Similarly I think integrating static linking into libbpf was a mistake.
>> > It should be a sub library as well.
>> >
>> > If we end up with a core libbpf and ten sublibs for tc, xdp, af_xdp, linking,
>> > whatever else, the users would appreciate that we don't shove a single libbpf
>> > at them with a ton of features that they might never use.
>> 
>> What's the concern exactly? The size of the library? Having 10
>> micro-libraries has its own set of downsides, 
>
> specifically?
>
>> I'm not convinced that's
>> a better situation for end users. And would certainly cause more
>> hassle for libbpf developers and packagers.
>
> For developers and packagers.. yes.
> For users.. quite the opposite.
> The skel gen and static linking must be split out before the next libbpf release.
> Not a single application linked with libbpf is going to use those pieces.

I'd tend to agree about the skeleton generation, but I have one use case
in mind where having the linker in library form would be handy:
dynamically building an XDP program at load time from pre-compiled
pieces.

Consider xdp-filter[0]: it's a simplistic packet filter that can filter
on different bits of the packet header, mostly meant as a demonstration
of XDP packet filtering performance. It's also using conditional
compilation so that it can be loaded in a mode that skips parsing L4
headers entirely if port-based filtering is not enabled. Right now we do
that by pre-compiling five different variants of the XDP program and
loading based on the selected feature set, but with linking in libbpf,
we could instead have a single BPF program with granular filtering
functions and just assemble the final program from those bits at load
time.

The actual xdp-filter program may be too simplistic to gain any
performance for this, but I believe the general approach could be a way
to realise the "improved performance through skipping code" promise of
an XDP-based data path. Having linking be part of libbpf will make this
straight-forward to integrate into applications.

[0] https://github.com/xdp-project/xdp-tools/tree/master/xdp-filter
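A rough sketch of what that load-time assembly could look like with the new
linker API (object file names are made up, error handling is trimmed, and the
bpf_linker__add_file() signature is the one from the current linker patches):

#include <stdbool.h>
#include <bpf/libbpf.h>

static struct bpf_object *assemble_xdp_filter(bool filter_ports)
{
	struct bpf_linker *linker;

	linker = bpf_linker__new("/tmp/xdp-filter.final.o", NULL);
	if (!linker)
		return NULL;

	/* the core program is always linked in */
	bpf_linker__add_file(linker, "xdp_filter_core.bpf.o");
	/* only pull in L4 parsing + port matching when it is enabled */
	if (filter_ports)
		bpf_linker__add_file(linker, "xdp_filter_ports.bpf.o");

	if (bpf_linker__finalize(linker)) {
		bpf_linker__free(linker);
		return NULL;
	}
	bpf_linker__free(linker);

	/* load the assembled object as usual */
	return bpf_object__open("/tmp/xdp-filter.final.o");
}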

> bpftool is the one and only tool that needs them. Hence forcing libbpf users
> to increase their .text with dead code is a selfish call by libbpf
> developers and packagers. The users' priorities must come first.
>
>> And what did you include in "core libbpf"?
>
> I would take this opportunity to split libbpf into maintainable pieces:
> - libsysbpf - sys_bpf wrappers (pretty much tools/lib/bpf/bpf.c)
> - libbpfutil - hash, strset
> - libbtf - BTF read/write
> - libbpfelf - ELF parsing, CORE, ksym, kconfig
> - libbpfskel - skeleton gen used by bpftool only
> - libbpflink - linker used by bpftool only
> - libbpfnet - networking attachment via netlink including TC and XDP
> - libbpftrace - perfbuf, ringbuf
> - libxdp - Toke's xdp chaining
> - libxsk - af_xdp logic

Huh? You've got to be joking? How is that going to improve things for
users? Just the cognitive load of figuring out which linker flags to use
is going to be prohibitive. Not to mention the hassle of keeping
multiple library versions in sync etc.

If the concern is .text size, surely there are better ways to fix that?
LTO is the obvious "automagic" solution, but even without that, just
supporting conditional compilation via defines in the existing libbpf
ought to achieve the same thing without exposing the gory details to the
users?

-Toke
Alexei Starovoitov March 30, 2021, 3:28 a.m. UTC | #9
On Sun, Mar 28, 2021 at 07:38:42PM -0700, Andrii Nakryiko wrote:
> 
> See above. I don't know what hassle libbpf is for users today. You
> were implying code size used for functionality users might not use
> (e.g., linker). Libbpf is a very small library, <300KB. There are
> users building tools for constrained embedded systems that use libbpf.
> There are/were various problems mentioned, but the size of libbpf
> wasn't yet one of them. We should certainly watch the code bloat, but
> we are not yet at the point where the library is too big for users to be
> turned off.

It's true that today sizeof(libbpf + libelf + libz) ~ 500k is not a concern.
I'm worried what it will get to if we don't start splitting things up.

Why split libxdp into its own lib?
If tc attach is going to be part of libbpf, all things xdp should be
part of libbpf as well.

But af_xdp folks are probably annoyed that they need to add -lelf and -lz
though they're not using them. Just a tech debt that eventually needs to be paid.

> > I would take this opportunity to split libbpf into maintainable pieces:
> > - libsysbpf - sys_bpf wrappers (pretty much tools/lib/bpf/bpf.c)
> > - libbpfutil - hash, strset
> 
> strset and hash are internal data structures, I never intended to
> expose them through public APIs. I haven't investigated, but if we
> have a separate shared library (libbpfutil), I imagine we won't be
> able to hide those APIs, right?

In the other thread you've proposed to copy-paste the hash implementation
into pahole. That's not ideal. If we had libbpfutil, other projects
could have used that without copy-paste.

> But again, let's just reflect for a second that we are talking about
> the library that takes less than 300KB total. 

that's today. Plus mandatory libelf and libz.
I would like to have libsysbpf that doesn't depend on libelf/libz
for folks that don't need it.
Also I'd like to see the symbolizer included in the "libbpf package".
Currently it's the main component that libbcc offers, but libbpf doesn't.
Say we don't split libbpf. Then symbolizer will bring some dwarf library
(say libdwarves ~ 1Mbyte) and libiberty ~ 500k (for c++ demangle).
Now we're looking at multi megabyte libbpf package.
I think the users would benefit from smaller building blocks.
Splitting into 10 mini libs is overkill, of course,
but some split is necessary.
I agree that moving static linking into separate lib won't really
affect .text size. The point is not to reduce text, but to establish
a framework where such things are possible. Then symbolizer and
anything fancier that would depend on other libs can be part
of "libbpf package". I mean single rpm that contains all libbpf libs.
Basic libsysbpf wouldn't need libelf/z.
libbpfsymbolizer would need libdwarf, etc.
Maybe some libbpfnet would depend on libnl or what not.
Andrii Nakryiko March 30, 2021, 8:28 p.m. UTC | #10
On Mon, Mar 29, 2021 at 8:28 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sun, Mar 28, 2021 at 07:38:42PM -0700, Andrii Nakryiko wrote:
> >
> > See above. I don't know what hassle libbpf is for users today. You
> > were implying code size used for functionality users might not use
> > (e.g., linker). Libbpf is a very small library, <300KB. There are
> > users building tools for constrained embedded systems that use libbpf.
> > There are/were various problems mentioned, but the size of libbpf
> > wasn't yet one of them. We should certainly watch the code bloat, but
> > we are not yet at the point where the library is too big for users to be
> > turned off.
>
> It's true that today sizeof(libbpf + libelf + libz) ~ 500k is not a concern.
> I'm worried what it will get to if we don't start splitting things up.

I'd say let's cross that bridge when we get there. We might never have
to even worry about that because libbpf won't grow in size that much.

>
> Why split libxdp into its own lib?

Because libxdp is establishing *a way* to perform XDP chaining and all
the setup around that. If it was the only right way to do this (and it
was clear it is the only way), then we might have declared that as a
solved problem worth providing with libbpf out of the box. I don't
think even Toke would claim that his approach is the only possible and
clearly superior to anything else. I think it's too nuanced and
complicated a problem to have *the* solution.

> If tc attach is going to be part of libbpf, all things xdp should be
> part of libbpf as well.

I'm not a TC expert, but it seems to be conceptually equivalent to the basic
"attach to cgroup" or "attach XDP to interface" or "attach to
tracepoint" APIs, so it seems in line with what libbpf is trying to
provide. If someone wants to construct a higher-level concept on
top of that (e.g., some chaining of TC programs or whatnot), then it
would be out of scope for libbpf.

>
> But af_xdp folks are probably annoyed that they need to add -lelf and -lz
> though they're not using them. Just a tech debt that eventually needs to be paid.

Those are dependencies of libbpf. Unless we want to re-implement ELF
handling code, we can't get rid of -lelf. I don't consider that a tech
debt at all. As for -lz, it's used for processing /proc/config.gz
(for __kconfig externs). We can do dynamic libz.so loading only when
__kconfig externs are used, if you think that's a big problem. But
libz is such a widely available library, that no one complained so
far. Yes, I'm annoyed by having to specify -lelf and -lz as well, but
that's how C linking works, so I can't do much about that. Even more,
the order matters as well!
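For context, this is roughly what a __kconfig extern looks like on the BPF
program side; a minimal sketch, where CONFIG_BPF_JIT is just an arbitrary
example symbol:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* libbpf resolves these at load time from /proc/config.gz (hence the zlib
 * dependency) or from /boot/config-*.
 */
extern unsigned int LINUX_KERNEL_VERSION __kconfig;
extern enum libbpf_tristate CONFIG_BPF_JIT __kconfig __weak;

SEC("classifier")
int kconfig_example(struct __sk_buff *skb)
{
	if (CONFIG_BPF_JIT == TRI_YES && LINUX_KERNEL_VERSION >= 0x050700 /* 5.7 */)
		bpf_printk("JIT is on and kernel is new enough");
	return 0;
}

char LICENSE[] SEC("license") = "GPL";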

And just in the last email you were proposing to add 10 more
-l<libbpfsomething> and were wondering what's the downside, so I'm
confused about the direction of this discussion.

>
> > > I would take this opportunity to split libbpf into maintainable pieces:
> > > - libsysbpf - sys_bpf wrappers (pretty much tools/lib/bpf/bpf.c)
> > > - libbpfutil - hash, strset
> >
> > strset and hash are internal data structures, I never intended to
> > expose them through public APIs. I haven't investigated, but if we
> > have a separate shared library (libbpfutil), I imagine we won't be
> > able to hide those APIs, right?
>
> In the other thread you've proposed to copy-paste the hash implementation
> into pahole. That's not ideal. If we had libbpfutil, other projects
> could have used that without copy-paste.

I know it's not ideal. But I don't think libbpf should be in the
business of providing generic data structures with stable APIs either.
We are stuck with C, unfortunately, so we have to live with its
deficiencies.

>
> > But again, let's just reflect for a second that we are talking about
> > the library that takes less than 300KB total.
>
> that's today. Plus mandatory libelf and libz.
> I would like to have libsysbpf that doesn't depend on libelf/libz
> for folks that don't need it.

TBH, bpf.c is such a minimal shim on top of bpf() syscall, that
providing all of its implementation as a single .h wouldn't be too
horrible. Then whatever applications want those syscall wrappers would
just include bpf/bpf.h and have no need for the library at all.
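Roughly, such a header-only shim would boil down to something like this
(mirroring the sys_bpf() wrapper in tools/lib/bpf/bpf.c; the map-lookup
helper name here is just illustrative):

#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
			  unsigned int size)
{
	return syscall(__NR_bpf, cmd, attr, size);
}

static inline int bpf_map_lookup_elem_inline(int fd, const void *key, void *value)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.map_fd = fd;
	attr.key = (__u64)(unsigned long)key;
	attr.value = (__u64)(unsigned long)value;

	return sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
}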

> Also I'd like to see the symbolizer included in the "libbpf package".
> Currently it's the main component that libbcc offers, but libbpf doesn't.
> Say we don't split libbpf. Then symbolizer will bring some dwarf library
> (say libdwarves ~ 1Mbyte) and libiberty ~ 500k (for c++ demangle).
> Now we're looking at multi megabyte libbpf package.

Right, which is one of the reasons why it probably doesn't belong in
libbpf at all. Another is that it's not BPF-specific functionality at
all.

> I think the users would benefit from smaller building blocks.
> Splitting into 10 mini libs is overkill, of course,
> but some split is necessary.
> I agree that moving static linking into separate lib won't really
> affect .text size. The point is not to reduce text, but to establish
> a framework where such things are possible. Then symbolizer and
> anything fancier that would depend on other libs can be part
> of "libbpf package". I mean single rpm that contains all libbpf libs.
> Basic libsysbpf wouldn't need libelf/z.
> libbpfsymbolizer would need libdwarf, etc.
> Maybe some libbpfnet would depend on libnl or what not.

I'm against pro-active splitting just in case. I'd rather discuss
specific problems when we get to them. I think it's premature right
now to split libbpf.
Alexei Starovoitov March 30, 2021, 11:27 p.m. UTC | #11
On Tue, Mar 30, 2021 at 1:28 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> >
> > In the other thread you've proposed to copy-paste the hash implementation
> > into pahole. That's not ideal. If we had libbpfutil, other projects
> > could have used that without copy-paste.
>
> I know it's not ideal. But I don't think libbpf should be in the
> business of providing generic data structures with stable APIs either.

There is a need for a hash in pahole, and it's already using libbpf.
It would be good to reuse the code.

> > that's today. Plus mandatory libelf and libz.
> > I would like to have libsysbpf that doesn't depend on libelf/libz
> > for folks that don't need it.
>
> TBH, bpf.c is such a minimal shim on top of bpf() syscall, that
> providing all of its implementation as a single .h wouldn't be too
> horrible. Then whatever applications want those syscall wrappers would
> just include bpf/bpf.h and have no need for the library at all.

1k line bpf.h. hmm. That's not going to be a conventional C header,
but it could work I guess.

> > Also I'd like to see the symbolizer included in the "libbpf package".
> > Currently it's the main component that libbcc offers, but libbpf doesn't.
> > Say we don't split libbpf. Then symbolizer will bring some dwarf library
> > (say libdwarves ~ 1Mbyte) and libiberty ~ 500k (for c++ demangle).
> > Now we're looking at multi megabyte libbpf package.
>
> Right, which is one of the reasons why it probably doesn't belong in
> libbpf at all. Another is that it's not BPF-specific functionality at
> all.

The symbolizer, usdt, python and lua bindings are what made libbcc successful.
I think "libbpf package" should include everything that bpf tracing folks
might need.
Getting -l flags correct from a single package isn't a big deal
compared with the need to deal with different packages that
depend on each other.

> I'm against pro-active splitting just in case. I'd rather discuss
> specific problems when we get to them. I think it's premature right
> now to split libbpf.

Fine.
I'm mainly advocating to change the mental model to see
libbpf as a collection of tools and libraries and not just a single libbpf.a.

Patch

diff --git a/tools/testing/selftests/bpf/prog_tests/test_tc_bpf.c b/tools/testing/selftests/bpf/prog_tests/test_tc_bpf.c
new file mode 100644
index 000000000000..8bab56b4dea0
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_tc_bpf.c
@@ -0,0 +1,261 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <linux/err.h>
+#include <bpf/libbpf.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <test_progs.h>
+#include <linux/if_ether.h>
+
+#define LO_IFINDEX 1
+
+static int test_tc_cls_internal(int fd, __u32 parent_id)
+{
+	struct bpf_tc_cls_attach_id id = {};
+	struct bpf_tc_cls_info info = {};
+	int ret;
+	DECLARE_LIBBPF_OPTS(bpf_tc_cls_opts, opts, .handle = 1, .priority = 10,
+			    .class_id = TC_H_MAKE(1UL << 16, 1),
+			    .chain_index = 5);
+
+	ret = bpf_tc_cls_attach_dev(fd, LO_IFINDEX, parent_id, ETH_P_IP, &opts,
+				    &id);
+	if (CHECK_FAIL(ret < 0))
+		return ret;
+
+	ret = bpf_tc_cls_get_info_dev(fd, LO_IFINDEX, parent_id, ETH_P_IP, NULL,
+				      &info);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+	ret = -1;
+
+	if (CHECK_FAIL(info.id.ifindex != id.ifindex) ||
+	    CHECK_FAIL(info.id.parent_id != id.parent_id) ||
+	    CHECK_FAIL(info.id.handle != id.handle) ||
+	    CHECK_FAIL(info.id.protocol != id.protocol) ||
+	    CHECK_FAIL(info.id.chain_index != id.chain_index) ||
+	    CHECK_FAIL(info.id.priority != id.priority) ||
+	    CHECK_FAIL(info.id.ifindex != LO_IFINDEX) ||
+	    CHECK_FAIL(info.id.parent_id != parent_id) ||
+	    CHECK_FAIL(info.id.handle != 1) ||
+	    CHECK_FAIL(info.id.priority != 10) ||
+	    CHECK_FAIL(info.id.protocol != ETH_P_IP) ||
+	    CHECK_FAIL(info.class_id != TC_H_MAKE(1UL << 16, 1)) ||
+	    CHECK_FAIL(info.id.chain_index != 5))
+		goto end;
+
+	opts.direct_action = true;
+	ret = bpf_tc_cls_replace_dev(fd, id.ifindex, id.parent_id, id.protocol,
+				     &opts, &id);
+	if (CHECK_FAIL(ret < 0))
+		return ret;
+
+end:;
+	ret = bpf_tc_cls_detach_dev(&id);
+	CHECK_FAIL(ret < 0);
+	return ret;
+}
+
+static int test_tc_cls(struct bpf_program *prog, __u32 parent_id)
+{
+	struct bpf_tc_cls_info info = {};
+	struct bpf_link *link;
+	int ret;
+	DECLARE_LIBBPF_OPTS(bpf_tc_cls_opts, opts, .priority = 10, .handle = 1,
+			    .class_id = TC_H_MAKE(1UL << 16, 1));
+
+	link = bpf_program__attach_tc_cls_dev(prog, LO_IFINDEX, parent_id,
+					      ETH_P_ALL, &opts);
+	if (CHECK_FAIL(IS_ERR_OR_NULL(link)))
+		return PTR_ERR(link);
+
+	ret = bpf_tc_cls_get_info_dev(bpf_program__fd(prog), LO_IFINDEX,
+				      parent_id, ETH_P_ALL, NULL, &info);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+	ret = -1;
+
+	if (CHECK_FAIL(info.id.ifindex != LO_IFINDEX) ||
+	    CHECK_FAIL(info.id.handle != 1) ||
+	    CHECK_FAIL(info.id.priority != 10) ||
+	    CHECK_FAIL(info.id.protocol != ETH_P_ALL) ||
+	    CHECK_FAIL(info.class_id != TC_H_MAKE(1UL << 16, 1)))
+		goto end;
+
+	/* Demonstrate changing attributes (e.g. to direct action) */
+	opts.class_id = TC_H_MAKE(1UL << 16, 2);
+	opts.direct_action = true;
+
+	/* Disconnect as we drop to the lower level API, which invalidates the
+	 * link.
+	 */
+	bpf_link__disconnect(link);
+
+	ret = bpf_tc_cls_change_dev(bpf_program__fd(prog), info.id.ifindex,
+				    info.id.parent_id, info.id.protocol, &opts,
+				    &info.id);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+	ret = bpf_tc_cls_get_info_dev(bpf_program__fd(prog), info.id.ifindex,
+				      info.id.parent_id, info.id.protocol, NULL,
+				      &info);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+	ret = -1;
+
+	if (CHECK_FAIL(info.class_id != TC_H_MAKE(1UL << 16, 2)))
+		goto end;
+	if (CHECK_FAIL((info.bpf_flags & TCA_BPF_FLAG_ACT_DIRECT) != 1))
+		goto end;
+
+	ret = bpf_tc_cls_detach_dev(&info.id);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+end:
+	ret = bpf_link__destroy(link);
+	CHECK_FAIL(ret < 0);
+	return ret;
+}
+
+static int test_tc_act_internal(int fd)
+{
+	struct bpf_tc_act_info info = {};
+	__u32 index = 0;
+	int ret;
+	DECLARE_LIBBPF_OPTS(bpf_tc_act_opts, opts, 0);
+
+	ret = bpf_tc_act_attach(fd, &opts, &index);
+	if (CHECK_FAIL(ret < 0 || !index))
+		goto end;
+
+	index = 0;
+	ret = bpf_tc_act_attach(fd, &opts, &index);
+	if (CHECK_FAIL(ret < 0 || !index))
+		goto end;
+
+	opts.index = 3;
+	index = 0;
+	ret = bpf_tc_act_attach(fd, &opts, &index);
+	if (CHECK_FAIL(ret < 0 || !index))
+		goto end;
+
+	index = 0;
+	ret = bpf_tc_act_replace(fd, &opts, &index);
+	if (CHECK_FAIL(ret < 0 || !index))
+		goto end;
+
+	opts.index = 1;
+	ret = bpf_tc_act_attach(fd, &opts, &index);
+	if (CHECK_FAIL(!ret || ret != -EEXIST)) {
+		ret = -1;
+		goto end;
+	}
+
+	for (int i = 0; i < 3; i++) {
+		memset(&info, 0, sizeof(info));
+
+		ret = bpf_tc_act_get_info(fd, &info);
+		if (CHECK_FAIL(ret < 0 && ret != -ESRCH))
+			goto end;
+
+		if (CHECK_FAIL(ret == -ESRCH))
+			goto end;
+
+		if (CHECK_FAIL(info.refcnt != 1))
+			goto end;
+
+		ret = bpf_tc_act_detach(info.index);
+		if (CHECK_FAIL(ret < 0))
+			goto end;
+	}
+
+	CHECK_FAIL(bpf_tc_act_get_info(fd, &info) == -ESRCH);
+
+end:
+	ret = bpf_tc_act_detach(0);
+	CHECK_FAIL(ret < 0);
+	return ret;
+}
+
+static int test_tc_act(struct bpf_program *prog)
+{
+	struct bpf_tc_act_info info = {};
+	struct bpf_link *link;
+	int ret;
+	DECLARE_LIBBPF_OPTS(bpf_tc_act_opts, opts, .index = 42);
+
+	link = bpf_program__attach_tc_act(prog, &opts);
+	if (CHECK_FAIL(IS_ERR_OR_NULL(link)))
+		return PTR_ERR(link);
+
+	ret = bpf_tc_act_get_info(bpf_program__fd(prog), &info);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+	if (CHECK_FAIL(info.index != 42))
+		goto end;
+
+end:
+	ret = bpf_link__destroy(link);
+	CHECK_FAIL(ret < 0);
+	return ret;
+}
+
+void test_test_tc_bpf(void)
+{
+	const char *file = "./test_tc_bpf_kern.o";
+	int cls_fd, act_fd, ret;
+	struct bpf_program *clsp, *actp;
+	struct bpf_object *obj;
+
+	obj = bpf_object__open(file);
+	if (CHECK_FAIL(IS_ERR_OR_NULL(obj)))
+		return;
+
+	clsp = bpf_object__find_program_by_title(obj, "classifier");
+	if (CHECK_FAIL(IS_ERR_OR_NULL(clsp)))
+		goto end;
+
+	actp = bpf_object__find_program_by_title(obj, "action");
+	if (CHECK_FAIL(IS_ERR_OR_NULL(actp)))
+		goto end;
+
+	ret = bpf_object__load(obj);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+	cls_fd = bpf_program__fd(clsp);
+	act_fd = bpf_program__fd(actp);
+
+	if (CHECK_FAIL(system("tc qdisc add dev lo clsact")))
+		goto end;
+
+	ret = test_tc_cls_internal(cls_fd, BPF_TC_CLSACT_INGRESS);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+	ret = test_tc_cls(clsp, BPF_TC_CLSACT_EGRESS);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+	system("tc qdisc del dev lo clsact");
+
+	ret = test_tc_act_internal(act_fd);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+	ret = test_tc_act(actp);
+	if (CHECK_FAIL(ret < 0))
+		goto end;
+
+end:
+	bpf_object__close(obj);
+	return;
+}
diff --git a/tools/testing/selftests/bpf/progs/test_tc_bpf_kern.c b/tools/testing/selftests/bpf/progs/test_tc_bpf_kern.c
new file mode 100644
index 000000000000..d39644ea0fd7
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_tc_bpf_kern.c
@@ -0,0 +1,18 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+// Dummy prog to test tc_bpf API
+
+SEC("classifier")
+int cls(struct __sk_buff *skb)
+{
+	return 0;
+}
+
+SEC("action")
+int act(struct __sk_buff *skb)
+{
+	return 0;
+}