
[v5,bpf-next,2/3] bpf: Relax precision marking in open coded iters and may_goto loop.

Message ID 20240606005425.38285-2-alexei.starovoitov@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Series [v5,bpf-next,1/3] bpf: Relax tuple len requirement for sk helpers.

Checks

Context Check Description
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
netdev/tree_selection success Clearly marked for bpf-next
netdev/apply fail Patch does not apply to bpf-next-0

Commit Message

Alexei Starovoitov June 6, 2024, 12:54 a.m. UTC
From: Alexei Starovoitov <ast@kernel.org>

v1->v2->v3:
- Algorithm changed completely between revisions:
  v1: https://lore.kernel.org/bpf/20240522024713.59136-1-alexei.starovoitov@gmail.com/
  v2: https://lore.kernel.org/bpf/20240523064219.42465-1-alexei.starovoitov@gmail.com/
v3->v4:
- Fixed widening for Rx < Ry case and added more tests
  v4: https://lore.kernel.org/bpf/20240601034211.63962-1-alexei.starovoitov@gmail.com/
v4->v5:
- Algorithm changed again:
. Widen either lower or upper scalar range instead of both. See widen_reg().
. Recognize predicted == or != and convert to <, > when possible
  and follow that branch only after propagating precision.
. Apply to scalar constants only.
These changes made a big difference for arena progs. In v4 the arena tests were in the noise.

Motivations for the patch
-------------------------
1.
Open coded iterators and may_goto are a great mechanism to implement loops,
but counted loops are problematic. For example:
  for (i = 0; i < 100 && can_loop; i++)
is verified as a bounded loop, since the i < 100 condition forces the verifier to
mark 'i' as precise, so loop states at different iterations are not equivalent.
That removes the benefit of open coded iterators and may_goto.
The workaround is to do:
  int zero = 0; /* global or volatile variable */
  for (i = zero; i < 100 && can_loop; i++)
to hide the value of 'i' from the verifier.
It's unnatural, and so far users haven't adopted such an odd programming pattern.
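
For reference, a minimal sketch of the workaround as a complete program
(illustrative names; can_loop is assumed to come from the selftests'
bpf_experimental.h):
  int zero = 0; /* global, so the verifier doesn't know its value */

  SEC("socket")
  int counted_loop(void *ctx)
  {
          long sum = 0;
          int i;

          for (i = zero; i < 100 && can_loop; i++)
                  sum += i;
          return sum != 0;
  }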

This patch aims to improve the verifier to support
  for (i = 0; i < 100000 && can_loop; i++)
as an open coded iter loop (when 'i' doesn't need to be precise).

Note, the i = zero workaround disables the bounded loop logic.
The open coded iterator bpf_for(i, 0, 100) also disables the bounded loop logic,
hence the heuristic in this patch is applied only to iters and may_goto.
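
E.g. a sketch of the open coded iterator form (bpf_for() is assumed to be
the macro from the selftests' bpf_experimental.h):
  int i;

  bpf_for(i, 0, 100) {
          /* 'i' is produced by the numbers iterator rather than by a
           * counted loop, so the bounded loop logic is not used here either */
  }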

2.
Arena based programs spend a significant amount of verification time
propagating precision due to predicted conditional branches,
but this precision tracking is useless work, since arena access doesn't
require precision, unlike regular map access.
The difference before/after:
File                    Insns (A)  Insns (B)  Insns     (DIFF)
----------------------  ---------  ---------  ----------------
arena_htab.bpf.o            18656        781  -17875 (-95.81%)
arena_htab_asm.bpf.o        18523        598  -17925 (-96.77%)
arena_list.bpf.o             1685       1780      +95 (+5.64%)

Algorithm
---------
First of all:
   if (is_may_goto_insn_at(env, insn_idx)) {
+          update_loop_entry(cur, &sl->state);
           if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
-                  update_loop_entry(cur, &sl->state);

It changes the verifier's definition of a state loop.
Previously, we considered a state loop to be a sequence of states
Si -> ... -> Sj -> ... -> Sk such that states_equal(Si, Sk, RANGE_WITHIN)
is true.

With this change Si -> ... -> Sj -> ... -> Sk is a loop if the call sites and
instruction pointers of Si and Sk match.

Whether or not Si and Sk are in the loop influences two things:
(a) whether exact comparison is needed for the states cache;
(b) whether the widening transformation can be applied to some scalars.

All pairs (Si, Sk) marked as a loop under the old definition are also
marked as such under the new definition (in addition to some new pairs).

Hence it is safe to apply (a) and (b) in strictly more cases.

Note that update_loop_entry() relies on the following properties:
- every state in the current DFS path (except current)
  has branches > 0;
- states not in the DFS path are either:
  - in explored_states, are fully explored and have branches == 0;
  - in env->stack, are not yet explored and have branches == 0
    (and also not reachable from is_state_visited()).

With that, get_loop_entry() can be used to gate the is_branch_taken() logic.
When the verifier sees 'r1 > 1000' inside the loop and can predict it,
instead of marking r1 as precise it widens both branches, so r1 becomes
[0, 1000] in the fallthrough and [1001, UMAX] in other_branch.

Consider the loop:
    bpf_for_each(...) {
       if (r1 > 1000)
          break;

       arr[r1] = ..;
    }
At the arr[r1] access r1 is bounded and the loop can quickly converge.

Unfortunately compilers (both GCC and LLVM) often optimize the loop exit
condition to equality, so
 for (i = 0; i < 100; i++) arr[i] = 1
becomes
 for (i = 0; i != 100; i++) arr[i] = 1

Hence treat != and == conditions specially in the verifier. When an equality
condition is predicted, check whether dst is < or > src. Example:
  r1 = 10
  goto L1
L2:
  arr[r1] = 1
  r1++
L1:
  if r1 != 100 goto L2

This branch will be predicted as fallthrough; check that r1 < 100
and, if so, widen r1 to [10, 99] in the fallthrough and
to [100, UMAX] in the other branch.
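
For reference, the asm above corresponds to source roughly like
(sketch, with an illustrative 'arr'):
  for (i = 10; i < 100; i++)
          arr[i] = 1;
which compilers commonly rotate into a do/while with an 'i != 100' exit test.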

With that, users can use the 'for (i = 0; ...' pattern everywhere
and many i = zero workarounds can be removed.

The tests with open coded iters see a dramatic improvement. The rest are in the noise.
File                  Program                          Insns (A)  Insns (B)  Insns       (DIFF)  Verdict (A)  Verdict (B)
--------------------  -------------------------------  ---------  ---------  ------------------  -----------  -----------
iters_task_vma.bpf.o  iter_task_vma_for_each               22043        132    -21911 (-99.40%)  success      success
iters_task_vma.bpf.o  iter_task_vma_for_each_eq            22043        131    -21912 (-99.41%)  success      success
iters_task_vma.bpf.o  loop_inside_iter                   1000001        148   -999853 (-99.99%)  failure      success
iters_task_vma.bpf.o  loop_inside_iter_signed            1000001        148   -999853 (-99.99%)  failure      success
iters_task_vma.bpf.o  loop_inside_iter_subprog           1000001         64   -999937 (-99.99%)  failure      success
iters_task_vma.bpf.o  loop_inside_iter_volatile_limit    1000001        134   -999867 (-99.99%)  failure      success

The bottom 4 tests were unverifiable before due to limitations of the bounded
loop logic.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/bpf/verifier.c | 330 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 302 insertions(+), 28 deletions(-)

Comments

Eduard Zingerman June 6, 2024, 11:08 a.m. UTC | #1
On Wed, 2024-06-05 at 17:54 -0700, Alexei Starovoitov wrote:

[...]

This looks interesting, need a bit more time to think about it.
A few minor notes below.

[...]

> @@ -14704,6 +14705,165 @@ static u8 rev_opcode(u8 opcode)
>  	}
>  }
>  
> +/* Similar to mark_reg_unknown() and should only be called from cap_bpf path */
> +static void mark_unknown(struct bpf_reg_state *reg)
> +{
> +	u32 id = reg->id;
> +
> +	__mark_reg_unknown_imprecise(reg);
> +	reg->id = id;
> +}
> +/*
> + * Similar to regs_refine_cond_op(), but instead of tightening the range
> + * widen the upper bound of reg1 based on reg2 and
> + * lower bound of reg2 based on reg1.
> + */
> +static void widen_reg_bounds(struct bpf_reg_state *reg1,
> +			     struct bpf_reg_state *reg2,
> +			     u8 opcode, bool is_jmp32)
> +{
> +	switch (opcode) {
> +	case BPF_JGE:
> +	case BPF_JGT:
> +	case BPF_JSGE:
> +	case BPF_JSGT:
> +		opcode = flip_opcode(opcode);
> +		swap(reg1, reg2);
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	switch (opcode) {
> +	case BPF_JLE:
> +		if (is_jmp32) {
> +			reg1->u32_max_value = reg2->u32_max_value;
> +			reg1->s32_max_value = S32_MAX;
> +			reg1->umax_value = U64_MAX;
> +			reg1->smax_value = S64_MAX;
> +
> +			reg2->u32_min_value = reg1->u32_min_value;
> +			reg2->s32_min_value = S32_MIN;
> +			reg2->umin_value = 0;
> +			reg2->smin_value = S64_MIN;
> +		} else {
> +			reg1->umax_value = reg2->umax_value;
> +			reg1->smax_value = S64_MAX;
> +			reg1->u32_max_value = U32_MAX;
> +			reg1->s32_max_value = S32_MAX;
> +
> +			reg2->umin_value = reg1->umin_value;
> +			reg2->smin_value = S64_MIN;
> +			reg2->u32_min_value = U32_MIN;
> +			reg2->s32_min_value = S32_MIN;
> +		}
> +		reg1->var_off = tnum_unknown;
> +		reg2->var_off = tnum_unknown;
> +		break;

Just a random thought: suppose that one of the registers in question
is used as an index into the array of ints, and compiler increments it
using += 4. Would it be interesting to preserve alignment info in the
var_off in such case? (in other words, preserve known trailing zeros).

[...]

> @@ -15177,8 +15339,78 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
>  	}
>  
>  	is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32;
> +	if (dst_reg->type != SCALAR_VALUE || src_reg->type != SCALAR_VALUE ||
> +	    /* Widen scalars only if they're constants */
> +	    !is_reg_const(dst_reg, is_jmp32) || !is_reg_const(src_reg, is_jmp32))
> +		do_widen = false;
> +	else if (reg_const_value(dst_reg, is_jmp32) == reg_const_value(src_reg, is_jmp32))
> +		/* And not equal */
> +		do_widen = false;
> +	else
> +		do_widen = (get_loop_entry(this_branch) ||
> +			    this_branch->may_goto_depth) &&
> +				/* Gate widen_reg() logic */
> +				env->bpf_capable;
> +
>  	pred = is_branch_taken(dst_reg, src_reg, opcode, is_jmp32);
> -	if (pred >= 0) {
> +
> +	if (do_widen && ((opcode == BPF_JNE && pred == 1) ||
> +			 (opcode == BPF_JEQ && pred == 0))) {
> +		/*
> +		 * != is too vague. let's try < and > and widen. Example:
> +		 *
> +		 * R6=2
> +		 * 21: (15) if r6 == 0x3e8 goto pc+14
> +		 * Predicted == not-taken, but < is also true
> +		 * 21: R6=scalar(smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=999,var_off=(0x0; 0x3ff))
> +		 */
> +		int refine_pred;
> +		u8 opcode2 = BPF_JLT;
> +
> +		refine_pred = is_branch_taken(dst_reg, src_reg, BPF_JLT, is_jmp32);
> +		if (refine_pred == 1) {
> +			widen_reg(env, dst_reg, src_reg, BPF_JLT, is_jmp32, true);
> +
> +		} else {

nit: would it be possible to avoid second call to is_branch_taken()
     by checking that refine_pred == 0 and assuming opcode2 == BPF_JGE?

> +			opcode2 = BPF_JGT;
> +			refine_pred = is_branch_taken(dst_reg, src_reg, BPF_JGT, is_jmp32);
> +			if (refine_pred == 1)
> +				widen_reg(env, dst_reg, src_reg, BPF_JGT, is_jmp32, true);
> +		}
> +
> +		if (refine_pred == 1) {
> +			if (dst_reg->id)
> +				find_equal_scalars(this_branch, dst_reg);

I think it is necessary to do find_equal_scalars() for src_reg as well,
since widen_reg() could change both registers.

> +			if (env->log.level & BPF_LOG_LEVEL) {
> +				verbose(env, "Predicted %s, but %s is also true\n",
> +					opcode == BPF_JNE ? "!= taken" : "== not-taken",
> +					opcode2 == BPF_JLT ? "<" : ">");
> +				print_insn_state(env, this_branch->frame[this_branch->curframe]);
> +			}
> +			err = mark_chain_precision(env, insn->dst_reg);
> +			if (err)
> +				return err;
> +			if (has_src_reg) {
> +				err = mark_chain_precision(env, insn->src_reg);
> +				if (err)
> +					return err;
> +			}
> +			if (pred == 1)
> +				*insn_idx += insn->off;
> +			return 0;
> +		}
> +		/*
> +		 * No luck. Predicted dst != src taken or dst == src not-taken,
> +		 * but !(dst < src) and !(dst > src).
> +		 * Constants must have been negative.
> +		 */
> +	}
> +
> +	if (do_widen && (opcode == BPF_JNE || opcode == BPF_JEQ || opcode == BPF_JSET))
> +		/* widen_reg() algorithm works for <, <=, >, >= only */
> +		do_widen = false;
> +
> +	if (pred >= 0 && !do_widen) {
>  		/* If we get here with a dst_reg pointer type it is because
>  		 * above is_branch_taken() special cased the 0 comparison.
>  		 */

[...]
Alexei Starovoitov June 6, 2024, 8:02 p.m. UTC | #2
On Thu, Jun 6, 2024 at 4:08 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2024-06-05 at 17:54 -0700, Alexei Starovoitov wrote:
>
> [...]
>
> This looks interesting, need a bit more time to think about it.
> A few minor notes below.
>
> [...]
>
> > @@ -14704,6 +14705,165 @@ static u8 rev_opcode(u8 opcode)
> >       }
> >  }
> >
> > +/* Similar to mark_reg_unknown() and should only be called from cap_bpf path */
> > +static void mark_unknown(struct bpf_reg_state *reg)
> > +{
> > +     u32 id = reg->id;
> > +
> > +     __mark_reg_unknown_imprecise(reg);
> > +     reg->id = id;
> > +}
> > +/*
> > + * Similar to regs_refine_cond_op(), but instead of tightening the range
> > + * widen the upper bound of reg1 based on reg2 and
> > + * lower bound of reg2 based on reg1.
> > + */
> > +static void widen_reg_bounds(struct bpf_reg_state *reg1,
> > +                          struct bpf_reg_state *reg2,
> > +                          u8 opcode, bool is_jmp32)
> > +{
> > +     switch (opcode) {
> > +     case BPF_JGE:
> > +     case BPF_JGT:
> > +     case BPF_JSGE:
> > +     case BPF_JSGT:
> > +             opcode = flip_opcode(opcode);
> > +             swap(reg1, reg2);
> > +             break;
> > +     default:
> > +             break;
> > +     }
> > +
> > +     switch (opcode) {
> > +     case BPF_JLE:
> > +             if (is_jmp32) {
> > +                     reg1->u32_max_value = reg2->u32_max_value;
> > +                     reg1->s32_max_value = S32_MAX;
> > +                     reg1->umax_value = U64_MAX;
> > +                     reg1->smax_value = S64_MAX;
> > +
> > +                     reg2->u32_min_value = reg1->u32_min_value;
> > +                     reg2->s32_min_value = S32_MIN;
> > +                     reg2->umin_value = 0;
> > +                     reg2->smin_value = S64_MIN;
> > +             } else {
> > +                     reg1->umax_value = reg2->umax_value;
> > +                     reg1->smax_value = S64_MAX;
> > +                     reg1->u32_max_value = U32_MAX;
> > +                     reg1->s32_max_value = S32_MAX;
> > +
> > +                     reg2->umin_value = reg1->umin_value;
> > +                     reg2->smin_value = S64_MIN;
> > +                     reg2->u32_min_value = U32_MIN;
> > +                     reg2->s32_min_value = S32_MIN;
> > +             }
> > +             reg1->var_off = tnum_unknown;
> > +             reg2->var_off = tnum_unknown;
> > +             break;
>
> Just a random thought: suppose that one of the registers in question
> is used as an index into the array of ints, and compiler increments it
> using += 4. Would it be interesting to preserve alignment info in the
> var_off in such case? (in other words, preserve known trailing zeros).

Well, the verifier cannot figure out which register is
an induction variable. The compiler can generate code where there
would be multiple such registers too.
But even if it was a single rX += 4
it's not clear how to figure out the size of the increment.

Also the above code is called at the time of a comparison like "if 2 < 100".
I figured I would try a heuristic at that time.
See attached diff.
It computes alignment of LHS and RHS and
then heuristically adjusts the range.
After spending all morning on it and various heuristics
I'm convinced that this is a dead end.
It cannot be made to work with i += 2 loops.

> [...]
>
> > @@ -15177,8 +15339,78 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
> >       }
> >
> >       is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32;
> > +     if (dst_reg->type != SCALAR_VALUE || src_reg->type != SCALAR_VALUE ||
> > +         /* Widen scalars only if they're constants */
> > +         !is_reg_const(dst_reg, is_jmp32) || !is_reg_const(src_reg, is_jmp32))
> > +             do_widen = false;
> > +     else if (reg_const_value(dst_reg, is_jmp32) == reg_const_value(src_reg, is_jmp32))
> > +             /* And not equal */
> > +             do_widen = false;
> > +     else
> > +             do_widen = (get_loop_entry(this_branch) ||
> > +                         this_branch->may_goto_depth) &&
> > +                             /* Gate widen_reg() logic */
> > +                             env->bpf_capable;
> > +
> >       pred = is_branch_taken(dst_reg, src_reg, opcode, is_jmp32);
> > -     if (pred >= 0) {
> > +
> > +     if (do_widen && ((opcode == BPF_JNE && pred == 1) ||
> > +                      (opcode == BPF_JEQ && pred == 0))) {
> > +             /*
> > +              * != is too vague. let's try < and > and widen. Example:
> > +              *
> > +              * R6=2
> > +              * 21: (15) if r6 == 0x3e8 goto pc+14
> > +              * Predicted == not-taken, but < is also true
> > +              * 21: R6=scalar(smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=999,var_off=(0x0; 0x3ff))
> > +              */
> > +             int refine_pred;
> > +             u8 opcode2 = BPF_JLT;
> > +
> > +             refine_pred = is_branch_taken(dst_reg, src_reg, BPF_JLT, is_jmp32);
> > +             if (refine_pred == 1) {
> > +                     widen_reg(env, dst_reg, src_reg, BPF_JLT, is_jmp32, true);
> > +
> > +             } else {
>
> nit: would it be possible to avoid second call to is_branch_taken()
>      by checking that refine_pred == 0 and assuming opcode2 == BPF_JGE?

I considered it, but it's too fragile, since it will depend
on both sides being constant (though the algorithm currently requires it)
and on both sides not being equal (though the algorithm also
checks that beforehand).

Much easier to reason about and experiment with the algorithm
when < and > are checked explicitly.
So I prefer to keep it this way.

> > +                     opcode2 = BPF_JGT;
> > +                     refine_pred = is_branch_taken(dst_reg, src_reg, BPF_JGT, is_jmp32);
> > +                     if (refine_pred == 1)
> > +                             widen_reg(env, dst_reg, src_reg, BPF_JGT, is_jmp32, true);
> > +             }
> > +
> > +             if (refine_pred == 1) {
> > +                     if (dst_reg->id)
> > +                             find_equal_scalars(this_branch, dst_reg);
>
> I think it is necessary to do find_equal_scalars() for src_reg as well,
> since widen_reg() could change both registers.

Kinda.
Doing find_equal_scalars(dst_reg) makes zero difference in tests.
I've added it for completeness,
but decided to avoid spending extra cycles on src_reg,
since most of the time it's BPF_K anyway.
We can add it later when there is clear evidence that it helps.
Eduard Zingerman June 6, 2024, 10:19 p.m. UTC | #3
On Thu, 2024-06-06 at 13:02 -0700, Alexei Starovoitov wrote:

[...]

> > > +             reg1->var_off = tnum_unknown;
> > > +             reg2->var_off = tnum_unknown;
> > > +             break;
> > 
> > Just a random thought: suppose that one of the registers in question
> > is used as an index into the array of ints, and compiler increments it
> > using += 4. Would it be interesting to preserve alignment info in the
> > var_off in such case? (in other words, preserve known trailing zeros).
> 
> Well, the verifier cannot figure out which register is
> an induction variable. Compiler can generate a code where
> would be multiple such registers too.
> But even if it was one rX += 4
> it's nor clear how to figure out the size of the increment.
> 
> Also the above code is called at the time of comparison like "if 2 < 100".
> I figured I will try a heuristic at that time.
> See attached diff.
> It computes alignment of LHS and RHS and
> then heuristically adjusts the range.
> After spending all morning on it and various heuristics
> I'm convinced that this is a dead end.
> It cannot be made to work with i += 2 loops.

Summary of off-list discussion below.
For the following C code:

    long arr1[1024];

    SEC("socket")
    __success
    int test1(const void *ctx)
    {
        long i;
    
        for (i = 0; i < 1024 && can_loop; i++)
                arr1[i] = i;
        return 0;
    }
    
clang generates the following BPF code:

0000000000000340 <test1>:
     104:       r1 = 0x0
     105:       r2 = 0x0 ll

0000000000000358 <LBB28_1>:
     107:       may_goto +0x4 <LBB28_3>
     108:       *(u64 *)(r2 + 0x0) = r1
     109:       r2 += 0x8
     110:       r1 += 0x1
     111:       if r1 != 0x400 goto -0x5 <LBB28_1>

0000000000000380 <LBB28_3>:
     112:       w0 = 0x0
     113:       exit

Here r2 is used as an index and r1 as a counter.
Since r2 is never compared it is never widened.
Hence my point about preserving trailing zeros
for widened values is moot, as it won't really
help for real-life programs.
Eduard Zingerman June 6, 2024, 11:39 p.m. UTC | #4
On Thu, 2024-06-06 at 15:19 -0700, Eduard Zingerman wrote:

[...]

> For the following C code:
> 
>     long arr1[1024];
> 
>     SEC("socket")
>     __success
>     int test1(const void *ctx)
>     {
>         long i;
>     
>         for (i = 0; i < 1024 && can_loop; i++)
>                 arr1[i] = i;
>         return 0;
>     }
>     
> clang generates the following BPF code:
> 
> 0000000000000340 <test1>:
>      104:       r1 = 0x0
>      105:       r2 = 0x0 ll
> 
> 0000000000000358 <LBB28_1>:
>      107:       may_goto +0x4 <LBB28_3>
>      108:       *(u64 *)(r2 + 0x0) = r1
>      109:       r2 += 0x8
>      110:       r1 += 0x1
>      111:       if r1 != 0x400 goto -0x5 <LBB28_1>
> 
> 0000000000000380 <LBB28_3>:
>      112:       w0 = 0x0
>      113:       exit

[...]

I've also taken a look at why the same program could be verified w/o
can_loop, but fails with can_loop (with this series applied):

0000000000000358 <LBB28_1>:
     107:       may_goto +0x4 <LBB28_3>
     108:       *(u64 *)(r2 + 0x0) = r1
     109:       r2 += 0x8
     110:       r1 += 0x1
     111:       if r1 != 0x400 goto -0x5 <LBB28_1>
                   ^^
    r1 is no longer marked precise,
    thus maybe_widen_reg() forgets the range for it
    when widen_imprecise_scalars() is called from
    may_goto processing logic.
    
As a result, verifier enumerates states
{r2=0P,r1=scalar()}, {r2=8P,r1=scalar()}, {r2=16P,r1=scalar()}, ...
eventually hitting the value of r2 that does not fit in 'arr1' bounds.
Alexei Starovoitov June 7, 2024, 12:28 a.m. UTC | #5
On Thu, Jun 6, 2024 at 4:39 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-06-06 at 15:19 -0700, Eduard Zingerman wrote:
>
> [...]
>
> > For the following C code:
> >
> >     long arr1[1024];
> >
> >     SEC("socket")
> >     __success
> >     int test1(const void *ctx)
> >     {
> >         long i;
> >
> >         for (i = 0; i < 1024 && can_loop; i++)
> >                 arr1[i] = i;
> >         return 0;
> >     }
> >
> > clang generates the following BPF code:
> >
> > 0000000000000340 <test1>:
> >      104:       r1 = 0x0
> >      105:       r2 = 0x0 ll
> >
> > 0000000000000358 <LBB28_1>:
> >      107:       may_goto +0x4 <LBB28_3>
> >      108:       *(u64 *)(r2 + 0x0) = r1
> >      109:       r2 += 0x8
> >      110:       r1 += 0x1
> >      111:       if r1 != 0x400 goto -0x5 <LBB28_1>
> >
> > 0000000000000380 <LBB28_3>:
> >      112:       w0 = 0x0
> >      113:       exit
>
> [...]
>
> I've also taken a look at why the same program could be verified w/o
> can_loop, but fails with can_loop (with this series applied):
>
> 0000000000000358 <LBB28_1>:
>      107:       may_goto +0x4 <LBB28_3>
>      108:       *(u64 *)(r2 + 0x0) = r1
>      109:       r2 += 0x8
>      110:       r1 += 0x1
>      111:       if r1 != 0x400 goto -0x5 <LBB28_1>
>                    ^^
>     r1 is no longer marked precise,
>     thus maybe_widen_reg() forgets the range for it
>     when widen_imprecise_scalars() is called from
>     may_goto processing logic.
>
> As a result, verifier enumerates states
> {r2=0P,r1=scalar()}, {r2=8P,r1=scalar()}, {r2=16P,r1=scalar()}, ...
> eventually hitting the value of r2 that does not fit in 'arr1' bounds.

That's correct.
As I was arguing a couple of emails ago, the existing
maybe_widen_reg() logic is just as damaging a heuristic
as this new widen_reg() logic.

With this hack:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 79e356ac02ab..23892759f05e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7928,7 +7928,7 @@ static void maybe_widen_reg(struct bpf_verifier_env *env,
                return;
        if (rold->precise || rcur->precise || regs_exact(rold, rcur, idmap))
                return;
-       __mark_reg_unknown(env, rcur);
+//     __mark_reg_unknown(env, rcur);
 }

the above test is passing.
Though widen_reg() widened the range, the bounded loop logic still works.
The loop now looks like
r1=[1,999] r2=8
r1=[2,999] r2=16
..
and eventually the verifier checks all 1k iterations of the loop
and passes.

Should I disable old maybe_widen_reg() when new widen_reg() kicks in? ;)

My point is that we had good and bad heuristics. The new one
helps arena. Is it perfect? No. But based on this particular
test it's arguable which one is worse.

Anyway, I think I'm going to tighten it a bit more.

pw-bot: cr

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 81a3d2ced78d..79e356ac02ab 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2321,6 +2321,7 @@  static void __mark_reg_unknown(const struct bpf_verifier_env *env,
 	reg->precise = !env->bpf_capable;
 }
 
+
 static void mark_reg_unknown(struct bpf_verifier_env *env,
 			     struct bpf_reg_state *regs, u32 regno)
 {
@@ -14704,6 +14705,165 @@  static u8 rev_opcode(u8 opcode)
 	}
 }
 
+/* Similar to mark_reg_unknown() and should only be called from cap_bpf path */
+static void mark_unknown(struct bpf_reg_state *reg)
+{
+	u32 id = reg->id;
+
+	__mark_reg_unknown_imprecise(reg);
+	reg->id = id;
+}
+/*
+ * Similar to regs_refine_cond_op(), but instead of tightening the range
+ * widen the upper bound of reg1 based on reg2 and
+ * lower bound of reg2 based on reg1.
+ */
+static void widen_reg_bounds(struct bpf_reg_state *reg1,
+			     struct bpf_reg_state *reg2,
+			     u8 opcode, bool is_jmp32)
+{
+	switch (opcode) {
+	case BPF_JGE:
+	case BPF_JGT:
+	case BPF_JSGE:
+	case BPF_JSGT:
+		opcode = flip_opcode(opcode);
+		swap(reg1, reg2);
+		break;
+	default:
+		break;
+	}
+
+	switch (opcode) {
+	case BPF_JLE:
+		if (is_jmp32) {
+			reg1->u32_max_value = reg2->u32_max_value;
+			reg1->s32_max_value = S32_MAX;
+			reg1->umax_value = U64_MAX;
+			reg1->smax_value = S64_MAX;
+
+			reg2->u32_min_value = reg1->u32_min_value;
+			reg2->s32_min_value = S32_MIN;
+			reg2->umin_value = 0;
+			reg2->smin_value = S64_MIN;
+		} else {
+			reg1->umax_value = reg2->umax_value;
+			reg1->smax_value = S64_MAX;
+			reg1->u32_max_value = U32_MAX;
+			reg1->s32_max_value = S32_MAX;
+
+			reg2->umin_value = reg1->umin_value;
+			reg2->smin_value = S64_MIN;
+			reg2->u32_min_value = U32_MIN;
+			reg2->s32_min_value = S32_MIN;
+		}
+		reg1->var_off = tnum_unknown;
+		reg2->var_off = tnum_unknown;
+		break;
+	case BPF_JLT:
+		if (is_jmp32) {
+			reg1->u32_max_value = reg2->u32_max_value - 1;
+			reg1->s32_max_value = S32_MAX;
+			reg1->umax_value = U64_MAX;
+			reg1->smax_value = S64_MAX;
+
+			reg2->u32_min_value = reg1->u32_min_value + 1;
+			reg2->s32_min_value = S32_MIN;
+			reg2->umin_value = 0;
+			reg2->smin_value = S64_MIN;
+		} else {
+			reg1->umax_value = reg2->umax_value - 1;
+			reg1->smax_value = S64_MAX;
+			reg1->u32_max_value = U32_MAX;
+			reg1->s32_max_value = S32_MAX;
+
+			reg2->umin_value = reg1->umin_value + 1;
+			reg2->smin_value = S64_MIN;
+			reg2->u32_min_value = U32_MIN;
+			reg2->s32_min_value = S32_MIN;
+		}
+		reg1->var_off = tnum_unknown;
+		reg2->var_off = tnum_unknown;
+		break;
+	case BPF_JSLE:
+		if (is_jmp32) {
+			reg1->u32_max_value = U32_MAX;
+			reg1->s32_max_value = reg2->s32_max_value;
+			reg1->umax_value = U64_MAX;
+			reg1->smax_value = S64_MAX;
+
+			reg2->u32_min_value = U32_MIN;
+			reg2->s32_min_value = reg1->s32_min_value;
+			reg2->umin_value = 0;
+			reg2->smin_value = S64_MIN;
+		} else {
+			reg1->umax_value = U64_MAX;
+			reg1->smax_value = reg2->smax_value;
+			reg1->u32_max_value = U32_MAX;
+			reg1->s32_max_value = S32_MAX;
+
+			reg2->umin_value = 0;
+			reg2->smin_value = reg1->smin_value;
+			reg2->u32_min_value = U32_MIN;
+			reg2->s32_min_value = S32_MIN;
+		}
+		reg1->var_off = tnum_unknown;
+		reg2->var_off = tnum_unknown;
+		break;
+	case BPF_JSLT:
+		if (is_jmp32) {
+			reg1->u32_max_value = U32_MAX;
+			reg1->s32_max_value = reg2->s32_max_value - 1;
+			reg1->umax_value = U64_MAX;
+			reg1->smax_value = S64_MAX;
+
+			reg2->u32_min_value = U32_MIN;
+			reg2->s32_min_value = reg1->s32_min_value + 1;
+			reg2->umin_value = 0;
+			reg2->smin_value = S64_MIN;
+		} else {
+			reg1->umax_value = U64_MAX;
+			reg1->smax_value = reg2->smax_value - 1;
+			reg1->u32_max_value = U32_MAX;
+			reg1->s32_max_value = S32_MAX;
+
+			reg2->umin_value = 0;
+			reg2->smin_value = reg1->smin_value + 1;
+			reg2->u32_min_value = U32_MIN;
+			reg2->s32_min_value = S32_MIN;
+		}
+		reg1->var_off = tnum_unknown;
+		reg2->var_off = tnum_unknown;
+		break;
+	default:
+		break;
+	}
+}
+
+/*
+ * Widen reg bounds. Example:
+ *   r1 = 3
+ *   r2 = 100
+ *   if (r1 < r2)
+ * will produce
+ *   r1 = [3, 99] r2 = [100, UMAX]
+ */
+static int widen_reg(struct bpf_verifier_env *env,
+		     struct bpf_reg_state *reg1,
+		     struct bpf_reg_state *reg2,
+		     u8 opcode, bool is_jmp32, bool branch_taken)
+{
+	int err;
+
+	widen_reg_bounds(reg1, reg2, branch_taken ? opcode : rev_opcode(opcode), is_jmp32);
+	reg_bounds_sync(reg1);
+	reg_bounds_sync(reg2);
+
+	err = reg_bounds_sanity_check(env, reg1, "widen reg1");
+	err = err ?: reg_bounds_sanity_check(env, reg2, "widen reg2");
+	return err;
+}
+
 /* Refine range knowledge for <reg1> <op> <reg>2 conditional operation. */
 static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state *reg2,
 				u8 opcode, bool is_jmp32)
@@ -15104,10 +15264,11 @@  static int check_cond_jmp_op(struct bpf_verifier_env *env,
 	struct bpf_verifier_state *other_branch;
 	struct bpf_reg_state *regs = this_branch->frame[this_branch->curframe]->regs;
 	struct bpf_reg_state *dst_reg, *other_branch_regs, *src_reg = NULL;
-	struct bpf_reg_state *eq_branch_regs;
+	struct bpf_reg_state *eq_branch_regs, *other_dst_reg = NULL, *other_src_reg = NULL;
 	struct bpf_reg_state fake_reg = {};
 	u8 opcode = BPF_OP(insn->code);
-	bool is_jmp32;
+	bool is_jmp32, do_widen;
+	bool has_src_reg = false;
 	int pred = -1;
 	int err;
 
@@ -15159,6 +15320,7 @@  static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		if (err)
 			return err;
 
+		has_src_reg = true;
 		src_reg = &regs[insn->src_reg];
 		if (!(reg_is_pkt_pointer_any(dst_reg) && reg_is_pkt_pointer_any(src_reg)) &&
 		    is_pointer_value(env, insn->src_reg)) {
@@ -15177,8 +15339,78 @@  static int check_cond_jmp_op(struct bpf_verifier_env *env,
 	}
 
 	is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32;
+	if (dst_reg->type != SCALAR_VALUE || src_reg->type != SCALAR_VALUE ||
+	    /* Widen scalars only if they're constants */
+	    !is_reg_const(dst_reg, is_jmp32) || !is_reg_const(src_reg, is_jmp32))
+		do_widen = false;
+	else if (reg_const_value(dst_reg, is_jmp32) == reg_const_value(src_reg, is_jmp32))
+		/* And not equal */
+		do_widen = false;
+	else
+		do_widen = (get_loop_entry(this_branch) ||
+			    this_branch->may_goto_depth) &&
+				/* Gate widen_reg() logic */
+				env->bpf_capable;
+
 	pred = is_branch_taken(dst_reg, src_reg, opcode, is_jmp32);
-	if (pred >= 0) {
+
+	if (do_widen && ((opcode == BPF_JNE && pred == 1) ||
+			 (opcode == BPF_JEQ && pred == 0))) {
+		/*
+		 * != is too vague. let's try < and > and widen. Example:
+		 *
+		 * R6=2
+		 * 21: (15) if r6 == 0x3e8 goto pc+14
+		 * Predicted == not-taken, but < is also true
+		 * 21: R6=scalar(smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=999,var_off=(0x0; 0x3ff))
+		 */
+		int refine_pred;
+		u8 opcode2 = BPF_JLT;
+
+		refine_pred = is_branch_taken(dst_reg, src_reg, BPF_JLT, is_jmp32);
+		if (refine_pred == 1) {
+			widen_reg(env, dst_reg, src_reg, BPF_JLT, is_jmp32, true);
+
+		} else {
+			opcode2 = BPF_JGT;
+			refine_pred = is_branch_taken(dst_reg, src_reg, BPF_JGT, is_jmp32);
+			if (refine_pred == 1)
+				widen_reg(env, dst_reg, src_reg, BPF_JGT, is_jmp32, true);
+		}
+
+		if (refine_pred == 1) {
+			if (dst_reg->id)
+				find_equal_scalars(this_branch, dst_reg);
+			if (env->log.level & BPF_LOG_LEVEL) {
+				verbose(env, "Predicted %s, but %s is also true\n",
+					opcode == BPF_JNE ? "!= taken" : "== not-taken",
+					opcode2 == BPF_JLT ? "<" : ">");
+				print_insn_state(env, this_branch->frame[this_branch->curframe]);
+			}
+			err = mark_chain_precision(env, insn->dst_reg);
+			if (err)
+				return err;
+			if (has_src_reg) {
+				err = mark_chain_precision(env, insn->src_reg);
+				if (err)
+					return err;
+			}
+			if (pred == 1)
+				*insn_idx += insn->off;
+			return 0;
+		}
+		/*
+		 * No luck. Predicted dst != src taken or dst == src not-taken,
+		 * but !(dst < src) and !(dst > src).
+		 * Constants must have been negative.
+		 */
+	}
+
+	if (do_widen && (opcode == BPF_JNE || opcode == BPF_JEQ || opcode == BPF_JSET))
+		/* widen_reg() algorithm works for <, <=, >, >= only */
+		do_widen = false;
+
+	if (pred >= 0 && !do_widen) {
 		/* If we get here with a dst_reg pointer type it is because
 		 * above is_branch_taken() special cased the 0 comparison.
 		 */
@@ -15189,6 +15421,60 @@  static int check_cond_jmp_op(struct bpf_verifier_env *env,
 			err = mark_chain_precision(env, insn->src_reg);
 		if (err)
 			return err;
+	} else {
+		/*
+		 * The verifier has to propagate precision if it's going to
+		 * continue exploring only one branch of conditional jump.
+		 * Otherwise push_stack() to explore both branches.
+		 */
+		other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx,
+					  false);
+		if (!other_branch)
+			return -EFAULT;
+		other_branch_regs = other_branch->frame[other_branch->curframe]->regs;
+		other_dst_reg = &other_branch_regs[insn->dst_reg];
+		if (has_src_reg)
+			other_src_reg = &other_branch_regs[insn->src_reg];
+	}
+
+	if (do_widen && pred >= 0) {
+		/*
+		 * Widen predicted <, <=, >, >= comparison of constant scalars. Example:
+		 *
+		 * R7=0x186a0
+		 * 21: (25) if r7 > 0x1869f goto pc-10
+		 * Predicted branch taken
+		 * 21: R7=scalar(smin=smin32=0,smax=umax=smax32=umax32=0x1869f,var_off=(0x0; 0x1ffff))
+		 * other branch:
+		 * R7=0x186a0
+		 *
+		 * R7=2
+		 * 21: (25) if r7 > 0x1869f goto pc-10
+		 * Predicted branch not-taken
+		 * 21: R7=scalar(smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=0x1869f,var_off=(0x0; 0x1ffff))
+		 * other branch:
+		 * R7=scalar(umin=0x186a0)
+		 */
+		if (pred == 1)
+			mark_unknown(dst_reg);
+		widen_reg(env, dst_reg, src_reg, opcode, is_jmp32, false);
+		if (!has_src_reg) {
+			other_src_reg = &fake_reg;
+			other_src_reg->type = SCALAR_VALUE;
+			__mark_reg_known(other_src_reg, insn->imm);
+		}
+		if (pred == 0)
+			mark_unknown(other_dst_reg);
+		widen_reg(env, other_dst_reg, other_src_reg, opcode, is_jmp32, true);
+
+		if (env->log.level & BPF_LOG_LEVEL) {
+			verbose(env, "Predicted branch %s\n", pred == 1 ? "taken" : "not-taken");
+			print_insn_state(env, this_branch->frame[this_branch->curframe]);
+			verbose(env, "other branch:\n");
+			mark_reg_scratched(env, insn->dst_reg);
+			print_verifier_state(env, other_branch->frame[other_branch->curframe], false);
+		}
+		goto skip_min_max;
 	}
 
 	if (pred == 1) {
@@ -15219,37 +15505,27 @@  static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		return 0;
 	}
 
-	other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx,
-				  false);
-	if (!other_branch)
-		return -EFAULT;
-	other_branch_regs = other_branch->frame[other_branch->curframe]->regs;
-
 	if (BPF_SRC(insn->code) == BPF_X) {
-		err = reg_set_min_max(env,
-				      &other_branch_regs[insn->dst_reg],
-				      &other_branch_regs[insn->src_reg],
+		err = reg_set_min_max(env, other_dst_reg, other_src_reg,
 				      dst_reg, src_reg, opcode, is_jmp32);
 	} else /* BPF_SRC(insn->code) == BPF_K */ {
-		err = reg_set_min_max(env,
-				      &other_branch_regs[insn->dst_reg],
-				      src_reg /* fake one */,
+		err = reg_set_min_max(env, other_dst_reg, src_reg /* fake one */,
 				      dst_reg, src_reg /* same fake one */,
 				      opcode, is_jmp32);
 	}
 	if (err)
 		return err;
-
-	if (BPF_SRC(insn->code) == BPF_X &&
+skip_min_max:
+	if (has_src_reg &&
 	    src_reg->type == SCALAR_VALUE && src_reg->id &&
-	    !WARN_ON_ONCE(src_reg->id != other_branch_regs[insn->src_reg].id)) {
+	    !WARN_ON_ONCE(src_reg->id != other_src_reg->id)) {
 		find_equal_scalars(this_branch, src_reg);
-		find_equal_scalars(other_branch, &other_branch_regs[insn->src_reg]);
+		find_equal_scalars(other_branch, other_src_reg);
 	}
 	if (dst_reg->type == SCALAR_VALUE && dst_reg->id &&
-	    !WARN_ON_ONCE(dst_reg->id != other_branch_regs[insn->dst_reg].id)) {
+	    !WARN_ON_ONCE(dst_reg->id != other_dst_reg->id)) {
 		find_equal_scalars(this_branch, dst_reg);
-		find_equal_scalars(other_branch, &other_branch_regs[insn->dst_reg]);
+		find_equal_scalars(other_branch, other_dst_reg);
 	}
 
 	/* if one pointer register is compared to another pointer
@@ -15264,7 +15540,7 @@  static int check_cond_jmp_op(struct bpf_verifier_env *env,
 	 * could be null even without PTR_MAYBE_NULL marking, so
 	 * only propagate nullness when neither reg is that type.
 	 */
-	if (!is_jmp32 && BPF_SRC(insn->code) == BPF_X &&
+	if (!is_jmp32 && has_src_reg &&
 	    __is_pointer_value(false, src_reg) && __is_pointer_value(false, dst_reg) &&
 	    type_may_be_null(src_reg->type) != type_may_be_null(dst_reg->type) &&
 	    base_type(src_reg->type) != PTR_TO_BTF_ID &&
@@ -17409,6 +17685,7 @@  static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			 * => unsafe memory access at 11 would not be caught.
 			 */
 			if (is_iter_next_insn(env, insn_idx)) {
+				update_loop_entry(cur, &sl->state);
 				if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
 					struct bpf_func_state *cur_frame;
 					struct bpf_reg_state *iter_state, *iter_reg;
@@ -17425,18 +17702,15 @@  static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 					 */
 					spi = __get_spi(iter_reg->off + iter_reg->var_off.value);
 					iter_state = &func(env, iter_reg)->stack[spi].spilled_ptr;
-					if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE) {
-						update_loop_entry(cur, &sl->state);
+					if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE)
 						goto hit;
-					}
 				}
 				goto skip_inf_loop_check;
 			}
 			if (is_may_goto_insn_at(env, insn_idx)) {
-				if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
-					update_loop_entry(cur, &sl->state);
+				update_loop_entry(cur, &sl->state);
+				if (states_equal(env, &sl->state, cur, RANGE_WITHIN))
 					goto hit;
-				}
 				goto skip_inf_loop_check;
 			}
 			if (calls_callback(env, insn_idx)) {