mbox series

[bpf-next,00/11] bpfilter

Message ID 20210517225308.720677-1-me@ubique.spb.ru (mailing list archive)
Headers show
Series bpfilter | expand

Message

Dmitrii Banshchikov May 17, 2021, 10:52 p.m. UTC
The patchset is based on the patches from David S. Miller [1] and Daniel
Borkmann [2].

The main goal of the patchset is to prepare bpfilter for iptables'
configuration blob parsing and code generation.

The patchset introduces data structures and code for matches, targets, rules
and tables.

It seems inconvenient to continue to use the same blob internally in bpfilter
in parts other than the blob parsing. That is why a superstructure with native
types is introduced. It provides a more convenient way to iterate over the blob
and limit the crazy structs widespread in the bpfilter code.

In this patchset version existing blob's correctness checking that is done by
the iptables kernel part is not reproduced. It will be added in the next
iteration.

Also the current version misses handling of counters. Postpone its
implementation until the code generation phase as it's not clear yet how to
better handle them.

The rough plan for the code generation.

It seems reasonable to assume that the first rules should cover most of the
packet flow.  This is why they are critical from the performance point of view.
At the same time a number of user defined rules might be pretty large. Also
there is a limit on size and complexity of a BPF program introduced by the
verifier.

There are two approaches how to handle iptables' rules in generated BPF programs.

The first approach is to generate a BPF program that is an equivalent to a set
of rules on a rule by rule basis. This approach should give the best
performance. The drawback is the limitation from the verifier on size and
complexity of BPF program.

The second approach is to use an internal representation of rules stored in a
BPF map and use bpf_for_each_map_elem() helper to iterate over them. In this case
the helper's callback is a BPF function that is able to process any valid
rule.

Combination of the two approaches should give most of the benefits - a
heuristic should help to select a small subset of the rules for code generation
on a rule by rule basis. All other rules are cold and it should be possible to
store them in an internal form in a BPF map. The rules will be handled by
bpf_for_each_map_elem().  This should remove the limit on the number of
supported rules.

During development it was useful to use statically linked sanitizers in
bpfilter usermode helper. Also it is possible to use fuzzers but it's not clear
if it is worth adding them to the test infrastructure - because there are no
other fuzzers under tools/testing/selftests currently.

Patch 1 adds definitions of the used types.
Patch 2 adds logging facility to bpfilter.
Patch 3 adds IO functions.
Patch 4 adds bpfilter header to tools
Patch 5 adds an associative map.
Patches 6/7/8/9 adds code for matches, targets, rules and table.
Patch 10 handles hooked setsockopt(2) calls.
Patch 11 uses prepared code in main().

Here is the example:

# dmesg  | tail -n 2
[   23.636102] bpfilter: Loaded bpfilter_umh pid 181
[   23.658529] bpfilter: started
# /usr/sbin/iptables-legacy -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
# /usr/sbin/iptables-legacy -A INPUT -p udp --dport 23 -j DROP
# /usr/sbin/iptables-legacy -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
DROP       udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:23

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
# /usr/sbin/iptables-legacy -F
# /usr/sbin/iptables-legacy -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
#


1. https://lore.kernel.org/patchwork/patch/902785/
2. https://lore.kernel.org/patchwork/patch/902783/

Dmitrii Banshchikov (11):
  bpfilter: Add types for usermode helper
  bpfilter: Add logging facility
  bpfilter: Add IO functions
  tools: Add bpfilter usermode helper header
  bpfilter: Add map container
  bpfilter: Add struct match
  bpfilter: Add struct target
  bpfilter: Add struct rule
  bpfilter: Add struct table
  bpfilter: Add handling of setsockopt() calls
  bpfilter: Handle setsockopts

 include/uapi/linux/bpfilter.h                 | 155 ++++++++
 net/bpfilter/Makefile                         |   3 +-
 net/bpfilter/bflog.c                          |  29 ++
 net/bpfilter/bflog.h                          |  24 ++
 net/bpfilter/context.c                        | 176 +++++++++
 net/bpfilter/context.h                        |  27 ++
 net/bpfilter/io.c                             |  77 ++++
 net/bpfilter/io.h                             |  18 +
 net/bpfilter/main.c                           |  99 ++---
 net/bpfilter/map-common.c                     |  64 ++++
 net/bpfilter/map-common.h                     |  19 +
 net/bpfilter/match-ops-map.h                  |  48 +++
 net/bpfilter/match.c                          |  73 ++++
 net/bpfilter/match.h                          |  34 ++
 net/bpfilter/rule.c                           | 128 +++++++
 net/bpfilter/rule.h                           |  27 ++
 net/bpfilter/sockopt.c                        | 357 ++++++++++++++++++
 net/bpfilter/sockopt.h                        |  14 +
 net/bpfilter/table-map.h                      |  41 ++
 net/bpfilter/table.c                          | 167 ++++++++
 net/bpfilter/table.h                          |  33 ++
 net/bpfilter/target-ops-map.h                 |  49 +++
 net/bpfilter/target.c                         | 112 ++++++
 net/bpfilter/target.h                         |  34 ++
 tools/include/uapi/linux/bpfilter.h           | 179 +++++++++
 .../testing/selftests/bpf/bpfilter/.gitignore |   6 +
 tools/testing/selftests/bpf/bpfilter/Makefile |  31 ++
 .../selftests/bpf/bpfilter/bpfilter_util.h    |  39 ++
 .../testing/selftests/bpf/bpfilter/test_io.c  | 100 +++++
 .../testing/selftests/bpf/bpfilter/test_map.c |  63 ++++
 .../selftests/bpf/bpfilter/test_match.c       |  63 ++++
 .../selftests/bpf/bpfilter/test_rule.c        |  55 +++
 .../selftests/bpf/bpfilter/test_target.c      |  85 +++++
 33 files changed, 2382 insertions(+), 47 deletions(-)
 create mode 100644 net/bpfilter/bflog.c
 create mode 100644 net/bpfilter/bflog.h
 create mode 100644 net/bpfilter/context.c
 create mode 100644 net/bpfilter/context.h
 create mode 100644 net/bpfilter/io.c
 create mode 100644 net/bpfilter/io.h
 create mode 100644 net/bpfilter/map-common.c
 create mode 100644 net/bpfilter/map-common.h
 create mode 100644 net/bpfilter/match-ops-map.h
 create mode 100644 net/bpfilter/match.c
 create mode 100644 net/bpfilter/match.h
 create mode 100644 net/bpfilter/rule.c
 create mode 100644 net/bpfilter/rule.h
 create mode 100644 net/bpfilter/sockopt.c
 create mode 100644 net/bpfilter/sockopt.h
 create mode 100644 net/bpfilter/table-map.h
 create mode 100644 net/bpfilter/table.c
 create mode 100644 net/bpfilter/table.h
 create mode 100644 net/bpfilter/target-ops-map.h
 create mode 100644 net/bpfilter/target.c
 create mode 100644 net/bpfilter/target.h
 create mode 100644 tools/include/uapi/linux/bpfilter.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/.gitignore
 create mode 100644 tools/testing/selftests/bpf/bpfilter/Makefile
 create mode 100644 tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_io.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_map.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_match.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_rule.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_target.c

Comments

Song Liu May 20, 2021, 4:54 a.m. UTC | #1
> On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> 
> The patchset is based on the patches from David S. Miller [1] and Daniel
> Borkmann [2].
> 
> The main goal of the patchset is to prepare bpfilter for iptables'
> configuration blob parsing and code generation.
> 
> The patchset introduces data structures and code for matches, targets, rules
> and tables.
> 
> It seems inconvenient to continue to use the same blob internally in bpfilter
> in parts other than the blob parsing. That is why a superstructure with native
> types is introduced. It provides a more convenient way to iterate over the blob
> and limit the crazy structs widespread in the bpfilter code.
> 

[...]

> 
> 
> 1. https://lore.kernel.org/patchwork/patch/902785/

[1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
I think we should do the same in this version. (Or were there discussions on
removing the prefix?). 

Thanks,
Song

[...]
Dmitrii Banshchikov May 20, 2021, 7:53 a.m. UTC | #2
On Thu, May 20, 2021 at 04:54:45AM +0000, Song Liu wrote:
> 
> 
> > On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> > 
> > The patchset is based on the patches from David S. Miller [1] and Daniel
> > Borkmann [2].
> > 
> > The main goal of the patchset is to prepare bpfilter for iptables'
> > configuration blob parsing and code generation.
> > 
> > The patchset introduces data structures and code for matches, targets, rules
> > and tables.
> > 
> > It seems inconvenient to continue to use the same blob internally in bpfilter
> > in parts other than the blob parsing. That is why a superstructure with native
> > types is introduced. It provides a more convenient way to iterate over the blob
> > and limit the crazy structs widespread in the bpfilter code.
> > 
> 
> [...]
> 
> > 
> > 
> > 1. https://lore.kernel.org/patchwork/patch/902785/
> 
> [1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
> I think we should do the same in this version. (Or were there discussions on
> removing the prefix?). 

There were no discussions about it.
As those structs are private to bpfilter I assumed that it is
safe to save some characters.
I will add the prefix to all internal structs in the next
iteration.


> 
> Thanks,
> Song
> 
> [...]
>
Alexei Starovoitov May 20, 2021, 4:55 p.m. UTC | #3
On Thu, May 20, 2021 at 12:53 AM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>
> On Thu, May 20, 2021 at 04:54:45AM +0000, Song Liu wrote:
> >
> >
> > > On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> > >
> > > The patchset is based on the patches from David S. Miller [1] and Daniel
> > > Borkmann [2].
> > >
> > > The main goal of the patchset is to prepare bpfilter for iptables'
> > > configuration blob parsing and code generation.
> > >
> > > The patchset introduces data structures and code for matches, targets, rules
> > > and tables.
> > >
> > > It seems inconvenient to continue to use the same blob internally in bpfilter
> > > in parts other than the blob parsing. That is why a superstructure with native
> > > types is introduced. It provides a more convenient way to iterate over the blob
> > > and limit the crazy structs widespread in the bpfilter code.
> > >
> >
> > [...]
> >
> > >
> > >
> > > 1. https://lore.kernel.org/patchwork/patch/902785/
> >
> > [1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
> > I think we should do the same in this version. (Or were there discussions on
> > removing the prefix?).
>
> There were no discussions about it.
> As those structs are private to bpfilter I assumed that it is
> safe to save some characters.
> I will add the prefix to all internal structs in the next
> iteration.

For internal types it's ok to skip the prefix otherwise it's too verbose.
In libbpf we skip 'bpf_' prefix in such cases.
Song Liu May 20, 2021, 5:56 p.m. UTC | #4
> On May 20, 2021, at 9:55 AM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> On Thu, May 20, 2021 at 12:53 AM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>> 
>> On Thu, May 20, 2021 at 04:54:45AM +0000, Song Liu wrote:
>>> 
>>> 
>>>> On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
>>>> 
>>>> The patchset is based on the patches from David S. Miller [1] and Daniel
>>>> Borkmann [2].
>>>> 
>>>> The main goal of the patchset is to prepare bpfilter for iptables'
>>>> configuration blob parsing and code generation.
>>>> 
>>>> The patchset introduces data structures and code for matches, targets, rules
>>>> and tables.
>>>> 
>>>> It seems inconvenient to continue to use the same blob internally in bpfilter
>>>> in parts other than the blob parsing. That is why a superstructure with native
>>>> types is introduced. It provides a more convenient way to iterate over the blob
>>>> and limit the crazy structs widespread in the bpfilter code.
>>>> 
>>> 
>>> [...]
>>> 
>>>> 
>>>> 
>>>> 1. https://lore.kernel.org/patchwork/patch/902785/
>>> 
>>> [1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
>>> I think we should do the same in this version. (Or were there discussions on
>>> removing the prefix?).
>> 
>> There were no discussions about it.
>> As those structs are private to bpfilter I assumed that it is
>> safe to save some characters.
>> I will add the prefix to all internal structs in the next
>> iteration.
> 
> For internal types it's ok to skip the prefix otherwise it's too verbose.
> In libbpf we skip 'bpf_' prefix in such cases.

Do we have plan to put some of this logic in a library? If that is the case, the 
effort now may save some pain in the future. 

Thanks,
Song
Dmitrii Banshchikov May 21, 2021, 6 a.m. UTC | #5
On Thu, May 20, 2021 at 05:56:30PM +0000, Song Liu wrote:
> 
> 
> > On May 20, 2021, at 9:55 AM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > 
> > On Thu, May 20, 2021 at 12:53 AM Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> >> 
> >> On Thu, May 20, 2021 at 04:54:45AM +0000, Song Liu wrote:
> >>> 
> >>> 
> >>>> On May 17, 2021, at 3:52 PM, Dmitrii Banshchikov <me@ubique.spb.ru> wrote:
> >>>> 
> >>>> The patchset is based on the patches from David S. Miller [1] and Daniel
> >>>> Borkmann [2].
> >>>> 
> >>>> The main goal of the patchset is to prepare bpfilter for iptables'
> >>>> configuration blob parsing and code generation.
> >>>> 
> >>>> The patchset introduces data structures and code for matches, targets, rules
> >>>> and tables.
> >>>> 
> >>>> It seems inconvenient to continue to use the same blob internally in bpfilter
> >>>> in parts other than the blob parsing. That is why a superstructure with native
> >>>> types is introduced. It provides a more convenient way to iterate over the blob
> >>>> and limit the crazy structs widespread in the bpfilter code.
> >>>> 
> >>> 
> >>> [...]
> >>> 
> >>>> 
> >>>> 
> >>>> 1. https://lore.kernel.org/patchwork/patch/902785/
> >>> 
> >>> [1] used bpfilter_ prefix on struct definitions, like "struct bpfilter_target"
> >>> I think we should do the same in this version. (Or were there discussions on
> >>> removing the prefix?).
> >> 
> >> There were no discussions about it.
> >> As those structs are private to bpfilter I assumed that it is
> >> safe to save some characters.
> >> I will add the prefix to all internal structs in the next
> >> iteration.
> > 
> > For internal types it's ok to skip the prefix otherwise it's too verbose.
> > In libbpf we skip 'bpf_' prefix in such cases.
> 
> Do we have plan to put some of this logic in a library? If that is the case, the 
> effort now may save some pain in the future. 

I cannot imagine a case when we need this logic in a library.
Even if we eventually need it as these definitions are private to
bpfilter - amount of pain should be minimal.

> 
> Thanks,
> Song
>