Message ID | 20210531020157.7386-1-harishankar.vishwanathan@rutgers.edu (mailing list archive) |
---|---|
State | Accepted |
Commit | 05924717ac704a868053652b20036aa3a2273e26 |
Delegated to: | BPF |
Headers | show |
Series | [v2,bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for bpf-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | warning | 8 maintainers not CCed: netdev@vger.kernel.org yhs@fb.com kpsingh@kernel.org daniel@iogearbox.net andrii@kernel.org kafai@fb.com john.fastabend@gmail.com songliubraving@fb.com |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 1 this patch: 1 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 50 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 1 this patch: 1 |
netdev/header_inline | success | Link |
On 31/05/2021 03:01, hv90@scarletmail.rutgers.edu wrote: > From: Harishankar Vishwanathan <harishankar.vishwanathan@rutgers.edu> > > This patch introduces a new algorithm for multiplication of tristate > numbers (tnums) that is provably sound. It is faster and more precise when > compared to the existing method. > > Like the existing method, this new algorithm follows the long > multiplication algorithm. The idea is to generate partial products by > multiplying each bit in the multiplier (tnum a) with the multiplicand > (tnum b), and adding the partial products after appropriately bit-shifting > them. The new algorithm, however, uses just a single loop over the bits of > the multiplier (tnum a) and accumulates only the uncertain components of > the multiplicand (tnum b) into a mask-only tnum. The following paper > explains the algorithm in more detail: https://arxiv.org/abs/2105.05398. > > A natural way to construct the tnum product is by performing a tnum > addition on all the partial products. This algorithm presents another > method of doing this: decompose each partial product into two tnums, > consisting of the values and the masks separately. The mask-sum is > accumulated within the loop in acc_m. The value-sum tnum is generated > using a.value * b.value. The tnum constructed by tnum addition of the > value-sum and the mask-sum contains all possible summations of concrete > values drawn from the partial product tnums pairwise. We prove this result > in the paper. > > Our evaluations show that the new algorithm is overall more precise > (producing tnums with less uncertain components) than the existing method. > As an illustrative example, consider the input tnums A and B. The numbers > in the paranthesis correspond to (value;mask). > > A = 000000x1 (1;2) > B = 0010011x (38;1) > A * B (existing) = xxxxxxxx (0;255) > A * B (new) = 0x1xxxxx (32;95) > > Importantly, we present a proof of soundness of the new algorithm in the > aforementioned paper. Additionally, we show that this new algorithm is > empirically faster than the existing method. > > Co-developed-by: Matan Shachnai <m.shachnai@rutgers.edu> > Signed-off-by: Matan Shachnai <m.shachnai@rutgers.edu> > Co-developed-by: Srinivas Narayana <srinivas.narayana@rutgers.edu> > Signed-off-by: Srinivas Narayana <srinivas.narayana@rutgers.edu> > Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu> > Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu> > Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@rutgers.edu> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Hello: This patch was applied to bpf/bpf-next.git (refs/heads/master): On Sun, 30 May 2021 22:01:57 -0400 you wrote: > From: Harishankar Vishwanathan <harishankar.vishwanathan@rutgers.edu> > > This patch introduces a new algorithm for multiplication of tristate > numbers (tnums) that is provably sound. It is faster and more precise when > compared to the existing method. > > Like the existing method, this new algorithm follows the long > multiplication algorithm. The idea is to generate partial products by > multiplying each bit in the multiplier (tnum a) with the multiplicand > (tnum b), and adding the partial products after appropriately bit-shifting > them. The new algorithm, however, uses just a single loop over the bits of > the multiplier (tnum a) and accumulates only the uncertain components of > the multiplicand (tnum b) into a mask-only tnum. The following paper > explains the algorithm in more detail: https://arxiv.org/abs/2105.05398. > > [...] Here is the summary with links: - [v2,bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul https://git.kernel.org/bpf/bpf-next/c/05924717ac70 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c index ceac5281bd31..3d7127f439a1 100644 --- a/kernel/bpf/tnum.c +++ b/kernel/bpf/tnum.c @@ -111,28 +111,31 @@ struct tnum tnum_xor(struct tnum a, struct tnum b) return TNUM(v & ~mu, mu); } -/* half-multiply add: acc += (unknown * mask * value). - * An intermediate step in the multiply algorithm. +/* Generate partial products by multiplying each bit in the multiplier (tnum a) + * with the multiplicand (tnum b), and add the partial products after + * appropriately bit-shifting them. Instead of directly performing tnum addition + * on the generated partial products, equivalenty, decompose each partial + * product into two tnums, consisting of the value-sum (acc_v) and the + * mask-sum (acc_m) and then perform tnum addition on them. The following paper + * explains the algorithm in more detail: https://arxiv.org/abs/2105.05398. */ -static struct tnum hma(struct tnum acc, u64 value, u64 mask) -{ - while (mask) { - if (mask & 1) - acc = tnum_add(acc, TNUM(0, value)); - mask >>= 1; - value <<= 1; - } - return acc; -} - struct tnum tnum_mul(struct tnum a, struct tnum b) { - struct tnum acc; - u64 pi; - - pi = a.value * b.value; - acc = hma(TNUM(pi, 0), a.mask, b.mask | b.value); - return hma(acc, b.mask, a.value); + u64 acc_v = a.value * b.value; + struct tnum acc_m = TNUM(0, 0); + + while (a.value || a.mask) { + /* LSB of tnum a is a certain 1 */ + if (a.value & 1) + acc_m = tnum_add(acc_m, TNUM(0, b.mask)); + /* LSB of tnum a is uncertain */ + else if (a.mask & 1) + acc_m = tnum_add(acc_m, TNUM(0, b.value | b.mask)); + /* Note: no case for LSB is certain 0 */ + a = tnum_rshift(a, 1); + b = tnum_lshift(b, 1); + } + return tnum_add(TNUM(acc_v, 0), acc_m); } /* Note that if a and b disagree - i.e. one has a 'known 1' where the other has