mbox series

[RFC,bpf-next,v0,0/7] Unifying signed and unsigned min/max tracking

Message ID 20231108054611.19531-1-shung-hsi.yu@suse.com (mailing list archive)
Headers show
Series Unifying signed and unsigned min/max tracking | expand

Message

Shung-Hsi Yu Nov. 8, 2023, 5:46 a.m. UTC
This patchset is a proof of concept for previously discussed concept[1]
of unifying smin/smax with umin/uax for both the 32-bit and 64-bit
range[2] within bpf_reg_state. This will allow the verifier to track
both signed and unsiged ranges using just two u32 (for 32-bit range) or
u64 (for 64-bit range), instead four as we currently do.

The size reduction we gain from this probably isn't very significant.
The main benefit is in lowering of code _complexity_. The verifier
currently employs 5 value tracking mechanisms, two for 64-bit range, two
for 32-bit range, and one tnum that tracks individual bits. Exchanging
knowledge between all the bound tracking mechanisms requires a
(theoretical) 20-way synchronization[3]; with signed and unsigned
unification the verifier will only need 3 value tracking mechanism,
cutting this down to a 6-way synchronization.

The unification is possible from a theoretical standpoint[4] and there
exists implementation[5]. The challenge lies in implementing it inside
the verifier and making sure it fits well with all the logic we have in
place.

To lower the difficulty, the unified min/max tracking is implemented in
isolation, and have it correctness checked using model checking. The
model checking code can be found in this patchset as well, but is not
meant to be merged since a better method already exists[6].

So far I've managed to implement add/sub/mul operations for unified
min/max tracking, the next steps are:
- implement operation that can be used gain knowledge from conditional
  jump, e.g wrange32_intersect, wrange32_diff
- implement wrange32_from_min_max and wrange32_to_min_max so we can
  check whether this PoC works using current selftests
- implement operations for wrange64, the 64-bit counterpart of wrange32
- come up with how to exchange knowledge between wrange64 and wrange32
  (this is likely the most difficult part)
- think about how to integrate this work in a manageable manner

Feedbacks for either the code, the naming, and/or the commit messages
are all welcome.


1: https://lore.kernel.org/bpf/ZTZxoDJJbX9mrQ9w@u94a/
2: To make model checking faster I'm only working with the 32-bit ranges
   for now
3: Synchronization can goes both ways, e.g. exchanging knowledge in
   umin/umax from/to tnum count as 2-way. But
4: https://dl.acm.org/doi/10.1145/2651360
5: https://github.com/caballa/wrapped-intervals/blob/master/lib/RangeAnalysis/WrappedRange.cpp
6: https://lore.kernel.org/r/1DA1AC52-6E2D-4CDA-8216-D1DD4648AD55@cs.rutgers.edu

Shung-Hsi Yu (7):
  Add inital wrange32 definition along with checks for umin/umax
  Lift the contrain requiring start <= end
  Support tracking signed min/max
  Implement wrange32_add()
  Implement wrange32_sub()
  Implement wrange32_mul()
  (WIP) Add helper functions that transform wrange32 to and from
    smin/smax/umin/umax

 include/linux/wrange.h                        |  61 ++++
 kernel/bpf/Makefile                           |   3 +-
 kernel/bpf/wrange.c                           |  61 ++++
 tools/testing/selftests/bpf/formal/wrange.py  | 274 ++++++++++++++++++
 .../selftests/bpf/formal/wrange_add.py        |  73 +++++
 .../selftests/bpf/formal/wrange_mul.py        |  87 ++++++
 .../selftests/bpf/formal/wrange_sub.py        |  72 +++++
 7 files changed, 630 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/wrange.h
 create mode 100644 kernel/bpf/wrange.c
 create mode 100755 tools/testing/selftests/bpf/formal/wrange.py
 create mode 100755 tools/testing/selftests/bpf/formal/wrange_add.py
 create mode 100755 tools/testing/selftests/bpf/formal/wrange_mul.py
 create mode 100755 tools/testing/selftests/bpf/formal/wrange_sub.py


base-commit: 856624f12b04a3f51094fa277a31a333ee81cb3f

Comments

Alexei Starovoitov Nov. 9, 2023, 7:38 p.m. UTC | #1
On Tue, Nov 7, 2023 at 9:46 PM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
>
> This patchset is a proof of concept for previously discussed concept[1]
> of unifying smin/smax with umin/uax for both the 32-bit and 64-bit
> range[2] within bpf_reg_state. This will allow the verifier to track
> both signed and unsiged ranges using just two u32 (for 32-bit range) or
> u64 (for 64-bit range), instead four as we currently do.
>
> The size reduction we gain from this probably isn't very significant.
> The main benefit is in lowering of code _complexity_. The verifier
> currently employs 5 value tracking mechanisms, two for 64-bit range, two
> for 32-bit range, and one tnum that tracks individual bits. Exchanging
> knowledge between all the bound tracking mechanisms requires a
> (theoretical) 20-way synchronization[3]; with signed and unsigned
> unification the verifier will only need 3 value tracking mechanism,
> cutting this down to a 6-way synchronization.
>
> The unification is possible from a theoretical standpoint[4] and there
> exists implementation[5]. The challenge lies in implementing it inside
> the verifier and making sure it fits well with all the logic we have in
> place.
>
> To lower the difficulty, the unified min/max tracking is implemented in
> isolation, and have it correctness checked using model checking. The
> model checking code can be found in this patchset as well, but is not
> meant to be merged since a better method already exists[6].
>
> So far I've managed to implement add/sub/mul operations for unified
> min/max tracking, the next steps are:
> - implement operation that can be used gain knowledge from conditional
>   jump, e.g wrange32_intersect, wrange32_diff
> - implement wrange32_from_min_max and wrange32_to_min_max so we can
>   check whether this PoC works using current selftests
> - implement operations for wrange64, the 64-bit counterpart of wrange32
> - come up with how to exchange knowledge between wrange64 and wrange32
>   (this is likely the most difficult part)
> - think about how to integrate this work in a manageable manner

Thanks for taking a stab at it.
The biggest question is how to integrate it without breaking anything.
I suspect you might need to implement all alu and branch logic
just to be able to run the tests.
It's difficult to see a path for partial/incremental addition.
The concern is that at the end this approach might hit an issue
which will make it infeasible.
So it's a big bet. Might be nice correctness and memory saving or nothing.
Certainly exciting, but proceed with caution.
Shung-Hsi Yu Nov. 11, 2023, 3:45 a.m. UTC | #2
On Thu, Nov 09, 2023 at 11:38:09AM -0800, Alexei Starovoitov wrote:
> On Tue, Nov 7, 2023 at 9:46 PM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
> > This patchset is a proof of concept for previously discussed concept[1]
> > of unifying smin/smax with umin/uax for both the 32-bit and 64-bit
> > range[2] within bpf_reg_state. This will allow the verifier to track
> > both signed and unsiged ranges using just two u32 (for 32-bit range) or
> > u64 (for 64-bit range), instead four as we currently do.
> >
> > The size reduction we gain from this probably isn't very significant.
> > The main benefit is in lowering of code _complexity_. The verifier
> > currently employs 5 value tracking mechanisms, two for 64-bit range, two
> > for 32-bit range, and one tnum that tracks individual bits. Exchanging
> > knowledge between all the bound tracking mechanisms requires a
> > (theoretical) 20-way synchronization[3]; with signed and unsigned
> > unification the verifier will only need 3 value tracking mechanism,
> > cutting this down to a 6-way synchronization.
> >
> > The unification is possible from a theoretical standpoint[4] and there
> > exists implementation[5]. The challenge lies in implementing it inside
> > the verifier and making sure it fits well with all the logic we have in
> > place.
> >
> > To lower the difficulty, the unified min/max tracking is implemented in
> > isolation, and have it correctness checked using model checking. The
> > model checking code can be found in this patchset as well, but is not
> > meant to be merged since a better method already exists[6].
> >
> > So far I've managed to implement add/sub/mul operations for unified
> > min/max tracking, the next steps are:
> > - implement operation that can be used gain knowledge from conditional
> >   jump, e.g wrange32_intersect, wrange32_diff
> > - implement wrange32_from_min_max and wrange32_to_min_max so we can
> >   check whether this PoC works using current selftests
> > - implement operations for wrange64, the 64-bit counterpart of wrange32
> > - come up with how to exchange knowledge between wrange64 and wrange32
> >   (this is likely the most difficult part)
> > - think about how to integrate this work in a manageable manner
> 
> Thanks for taking a stab at it.
> The biggest question is how to integrate it without breaking anything.
> I suspect you might need to implement all alu and branch logic
> just to be able to run the tests.

Once the wrange32_{to,from}_min_max() helpers in patch 7 is implemented, I
should be able to swap out individual alu operation while keeping
bpf_reg_state untouched. E.g. for addition in 32-bit

  static void scalar32_min_max_add(struct bpf_reg_state *dst_reg,
                                   struct bpf_reg_state *src_reg)
  {
      struct wrange32 a = wrange32_from_min_max(dst_reg->smin, dst_reg->smax,
                                                dst_reg->umin, dst_reg->umax);
      struct wrange32 b = wrange32_from_min_max(src_reg->smin, src_reg->smax,
                                                src_reg->umin, src_reg->umax);
      
      wrange32_to_min_max(wrange32_add(a, b), &dst_reg->smin, &dst_reg->smax,
                          &dst_reg->umin, &dst_reg->umax);
  }

and get current tests to run on top of the new algorithm. This won't cover
every aspect, but should be enough as a first taste on how well (or unwell)
the integration will be.

These helpers also can help to create finer intermediate steps for smoother
integration; something that's added in the beginning to aid the transition,
but removed after the transition is done.

> It's difficult to see a path for partial/incremental addition.
> The concern is that at the end this approach might hit an issue
> which will make it infeasible.

Agree. While the helpers above can aid with integration, I do not see a safe
path for partial addition. At least not before everything until
reg_bound_sync() proofs to work should it be considered.
Still a long way to go.

> So it's a big bet. Might be nice correctness and memory saving or nothing.
> Certainly exciting, but proceed with caution.

Having enough optimism and attachment to tackle this but not too much to the
point of overlooking its flaws is certainly a challenging task.
Will try my best :)

Thanks for the feedbacks!
Shung-Hsi Yu Nov. 11, 2023, 7:54 a.m. UTC | #3
On Sat, Nov 11, 2023 at 11:45:22AM +0800, Shung-Hsi Yu wrote:
> On Thu, Nov 09, 2023 at 11:38:09AM -0800, Alexei Starovoitov wrote:
> > On Tue, Nov 7, 2023 at 9:46 PM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
> > > This patchset is a proof of concept for previously discussed concept[1]
> > > of unifying smin/smax with umin/uax for both the 32-bit and 64-bit
> > > range[2] within bpf_reg_state. This will allow the verifier to track
> > > both signed and unsiged ranges using just two u32 (for 32-bit range) or
> > > u64 (for 64-bit range), instead four as we currently do.
> > >
> > > The size reduction we gain from this probably isn't very significant.
> > > The main benefit is in lowering of code _complexity_. The verifier
> > > currently employs 5 value tracking mechanisms, two for 64-bit range, two
> > > for 32-bit range, and one tnum that tracks individual bits. Exchanging
> > > knowledge between all the bound tracking mechanisms requires a
> > > (theoretical) 20-way synchronization[3]; with signed and unsigned
> > > unification the verifier will only need 3 value tracking mechanism,
> > > cutting this down to a 6-way synchronization.
> > >
> > > The unification is possible from a theoretical standpoint[4] and there
> > > exists implementation[5]. The challenge lies in implementing it inside
> > > the verifier and making sure it fits well with all the logic we have in
> > > place.
> > >
> > > To lower the difficulty, the unified min/max tracking is implemented in
> > > isolation, and have it correctness checked using model checking. The
> > > model checking code can be found in this patchset as well, but is not
> > > meant to be merged since a better method already exists[6].
> > >
> > > So far I've managed to implement add/sub/mul operations for unified
> > > min/max tracking, the next steps are:
> > > - implement operation that can be used gain knowledge from conditional
> > >   jump, e.g wrange32_intersect, wrange32_diff
> > > - implement wrange32_from_min_max and wrange32_to_min_max so we can
> > >   check whether this PoC works using current selftests
> > > - implement operations for wrange64, the 64-bit counterpart of wrange32
> > > - come up with how to exchange knowledge between wrange64 and wrange32
> > >   (this is likely the most difficult part)
> > > - think about how to integrate this work in a manageable manner
> > 
> > Thanks for taking a stab at it.
> > The biggest question is how to integrate it without breaking anything.
> > I suspect you might need to implement all alu and branch logic
> > just to be able to run the tests.
> 
> Once the wrange32_{to,from}_min_max() helpers in patch 7 is implemented, I
> should be able to swap out individual alu operation while keeping
> bpf_reg_state untouched. E.g. for addition in 32-bit
> 
>   static void scalar32_min_max_add(struct bpf_reg_state *dst_reg,
>                                    struct bpf_reg_state *src_reg)
>   {
>       struct wrange32 a = wrange32_from_min_max(dst_reg->smin, dst_reg->smax,
>                                                 dst_reg->umin, dst_reg->umax);
>       struct wrange32 b = wrange32_from_min_max(src_reg->smin, src_reg->smax,
>                                                 src_reg->umin, src_reg->umax);
>       
>       wrange32_to_min_max(wrange32_add(a, b), &dst_reg->smin, &dst_reg->smax,
>                           &dst_reg->umin, &dst_reg->umax);
>   }
> 
> and get current tests to run on top of the new algorithm. This won't cover
> every aspect, but should be enough as a first taste on how well (or unwell)
> the integration will be.
> 
> These helpers also can help to create finer intermediate steps for smoother
> integration; something that's added in the beginning to aid the transition,
> but removed after the transition is done.
> 
> > It's difficult to see a path for partial/incremental addition.
> > The concern is that at the end this approach might hit an issue
> > which will make it infeasible.
> 
> Agree. While the helpers above can aid with integration, I do not see a safe
> path for partial addition. At least not before everything until
> reg_bound_sync() proofs to work should it be considered.
> Still a long way to go.

Meant to say "everything until, and including, reg_bound_sync() are
proofed to work", since it is likely the hardest part.

[...]