From patchwork Fri Mar 15 14:29:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Tissoires
X-Patchwork-Id: 13593531
X-Patchwork-Delegate: bpf@iogearbox.net
From: Benjamin Tissoires
Date: Fri, 15 Mar 2024 15:29:25 +0100
Subject: [PATCH bpf-next v4 1/6] bpf/helpers: introduce sleepable bpf_timers
X-Mailing-List: bpf@vger.kernel.org
Message-Id: <20240315-hid-bpf-sleepable-v4-1-5658f2540564@kernel.org>
References: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
In-Reply-To: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: Benjamin Tissoires, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org

Sleepable timers are implemented on top of a workqueue, which means that there are no guarantees of timing or ordering.
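
Because the work is queued immediately, a sleepable timer cannot carry a delay. As a rough illustration, here is a minimal userspace sketch of the argument checks this patch adds at the top of bpf_timer_start(). The flag values mirror the uapi enum touched by this patch; model_timer_start_check() is a hypothetical name used only for this sketch, not a kernel function, and the real helper performs further checks (NMI context, timer initialization) that are left out here.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as defined in the uapi enum touched by this patch. */
#define BPF_F_TIMER_ABS       (1ULL << 0)
#define BPF_F_TIMER_CPU_PIN   (1ULL << 1)
#define BPF_F_TIMER_SLEEPABLE (1ULL << 2)

/*
 * Hypothetical userspace model of the argument validation done at the
 * top of bpf_timer_start(). This is an illustration, not kernel code.
 */
int model_timer_start_check(uint64_t nsecs, uint64_t flags)
{
	/* Unknown flag bits are rejected. */
	if (flags & ~(BPF_F_TIMER_ABS | BPF_F_TIMER_CPU_PIN | BPF_F_TIMER_SLEEPABLE))
		return -EINVAL;

	/* A sleepable timer is queued right away, so a delay is rejected. */
	if ((flags & BPF_F_TIMER_SLEEPABLE) && nsecs)
		return -EINVAL;

	return 0;
}
```

Under this model, a call with BPF_F_TIMER_SLEEPABLE and nsecs == 0 is accepted, while the same flag with any non-zero nsecs yields -EINVAL.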
Signed-off-by: Benjamin Tissoires
---
changes in v4:
- dropped __bpf_timer_compute_key()
- use a spin_lock instead of a semaphore
- ensure bpf_timer_cancel_and_free does not complain about a non-sleepable context and use cancel_work() instead of cancel_work_sync()
- return -EINVAL if a delay is given to bpf_timer_start() with BPF_F_TIMER_SLEEPABLE

changes in v3:
- extracted the implementation in bpf_timer only, without bpf_timer_set_sleepable_cb()
- rely on schedule_work() only, from bpf_timer_start()
- add semaphore to ensure bpf_timer_work_cb() is accessing consistent data

changes in v2 (compared to the one attached to v1 0/9):
- make use of a kfunc
- add a (non-used) BPF_F_TIMER_SLEEPABLE
- the callback is *not* called, as calling it makes the kernel crash
---
include/uapi/linux/bpf.h | 4 +++ kernel/bpf/helpers.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 88 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 3c42b9f1bada..b90def29d796 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -7461,10 +7461,14 @@ struct bpf_core_relo { * - BPF_F_TIMER_ABS: Timeout passed is absolute time, by default it is * relative to current time. * - BPF_F_TIMER_CPU_PIN: Timer will be pinned to the CPU of the caller. + * - BPF_F_TIMER_SLEEPABLE: Timer will run in a sleepable context, with + * no guarantees of ordering nor timing (consider this as being just + * offloaded immediately). */ enum { BPF_F_TIMER_ABS = (1ULL << 0), BPF_F_TIMER_CPU_PIN = (1ULL << 1), + BPF_F_TIMER_SLEEPABLE = (1ULL << 2), }; /* BPF numbers iterator state */ diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index a89587859571..38de73a9df83 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1094,14 +1094,20 @@ const struct bpf_func_proto bpf_snprintf_proto = { * bpf_timer_cancel() cancels the timer and decrements prog's refcnt. * Inner maps can contain bpf timers as well.
ops->map_release_uref is * freeing the timers when inner map is replaced or deleted by user space. + * + * sleepable_lock protects only the setup of the workqueue, not the callback + * itself. This is done to ensure we don't run concurrently a free of the + * callback or the associated program. */ struct bpf_hrtimer { struct hrtimer timer; + struct work_struct work; struct bpf_map *map; struct bpf_prog *prog; void __rcu *callback_fn; void *value; struct rcu_head rcu; + spinlock_t sleepable_lock; }; /* the actual struct hidden inside uapi struct bpf_timer */ @@ -1114,6 +1120,49 @@ struct bpf_timer_kern { struct bpf_spin_lock lock; } __attribute__((aligned(8))); +static void bpf_timer_work_cb(struct work_struct *work) +{ + struct bpf_hrtimer *t = container_of(work, struct bpf_hrtimer, work); + struct bpf_map *map = t->map; + bpf_callback_t callback_fn; + void *value = t->value; + unsigned long flags; + void *key; + u32 idx; + + BTF_TYPE_EMIT(struct bpf_timer); + + spin_lock_irqsave(&t->sleepable_lock, flags); + + callback_fn = READ_ONCE(t->callback_fn); + if (!callback_fn) { + spin_unlock_irqrestore(&t->sleepable_lock, flags); + return; + } + + if (map->map_type == BPF_MAP_TYPE_ARRAY) { + struct bpf_array *array = container_of(map, struct bpf_array, map); + + /* compute the key */ + idx = ((char *)value - array->value) / array->elem_size; + key = &idx; + } else { /* hash or lru */ + key = value - round_up(map->key_size, 8); + } + + /* prevent the callback to be freed by bpf_timer_cancel() while running + * so we can release the sleepable lock + */ + bpf_prog_inc(t->prog); + + spin_unlock_irqrestore(&t->sleepable_lock, flags); + + callback_fn((u64)(long)map, (u64)(long)key, (u64)(long)value, 0, 0); + /* The verifier checked that return value is zero. 
*/ + + bpf_prog_put(t->prog); +} + static DEFINE_PER_CPU(struct bpf_hrtimer *, hrtimer_running); static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer) @@ -1192,6 +1241,8 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map t->prog = NULL; rcu_assign_pointer(t->callback_fn, NULL); hrtimer_init(&t->timer, clockid, HRTIMER_MODE_REL_SOFT); + INIT_WORK(&t->work, bpf_timer_work_cb); + spin_lock_init(&t->sleepable_lock); t->timer.function = bpf_timer_cb; WRITE_ONCE(timer->timer, t); /* Guarantee the order between timer->timer and map->usercnt. So @@ -1237,6 +1288,7 @@ BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callb ret = -EINVAL; goto out; } + spin_lock(&t->sleepable_lock); if (!atomic64_read(&t->map->usercnt)) { /* maps with timers must be either held by user space * or pinned in bpffs. Otherwise timer might still be @@ -1263,6 +1315,8 @@ BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callb } rcu_assign_pointer(t->callback_fn, callback_fn); out: + if (t) + spin_unlock(&t->sleepable_lock); __bpf_spin_unlock_irqrestore(&timer->lock); return ret; } @@ -1283,8 +1337,12 @@ BPF_CALL_3(bpf_timer_start, struct bpf_timer_kern *, timer, u64, nsecs, u64, fla if (in_nmi()) return -EOPNOTSUPP; - if (flags & ~(BPF_F_TIMER_ABS | BPF_F_TIMER_CPU_PIN)) + if (flags & ~(BPF_F_TIMER_ABS | BPF_F_TIMER_CPU_PIN | BPF_F_TIMER_SLEEPABLE)) return -EINVAL; + + if ((flags & BPF_F_TIMER_SLEEPABLE) && nsecs) + return -EINVAL; + __bpf_spin_lock_irqsave(&timer->lock); t = timer->timer; if (!t || !t->prog) { @@ -1300,7 +1358,10 @@ BPF_CALL_3(bpf_timer_start, struct bpf_timer_kern *, timer, u64, nsecs, u64, fla if (flags & BPF_F_TIMER_CPU_PIN) mode |= HRTIMER_MODE_PINNED; - hrtimer_start(&t->timer, ns_to_ktime(nsecs), mode); + if (flags & BPF_F_TIMER_SLEEPABLE) + schedule_work(&t->work); + else + hrtimer_start(&t->timer, ns_to_ktime(nsecs), mode); out: __bpf_spin_unlock_irqrestore(&timer->lock); return 
ret; @@ -1348,13 +1409,22 @@ BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer) ret = -EDEADLK; goto out; } + spin_lock(&t->sleepable_lock); drop_prog_refcnt(t); + spin_unlock(&t->sleepable_lock); out: __bpf_spin_unlock_irqrestore(&timer->lock); /* Cancel the timer and wait for associated callback to finish * if it was running. */ ret = ret ?: hrtimer_cancel(&t->timer); + + /* also cancel the sleepable work, but *do not* wait for + * it to finish if it was running as we might not be in a + * sleepable context + */ + ret = ret ?: cancel_work(&t->work); + rcu_read_unlock(); return ret; } @@ -1383,11 +1453,13 @@ void bpf_timer_cancel_and_free(void *val) t = timer->timer; if (!t) goto out; + spin_lock(&t->sleepable_lock); drop_prog_refcnt(t); /* The subsequent bpf_timer_start/cancel() helpers won't be able to use * this timer, since it won't be initialized. */ WRITE_ONCE(timer->timer, NULL); + spin_unlock(&t->sleepable_lock); out: __bpf_spin_unlock_irqrestore(&timer->lock); if (!t) @@ -1410,6 +1482,16 @@ void bpf_timer_cancel_and_free(void *val) */ if (this_cpu_read(hrtimer_running) != t) hrtimer_cancel(&t->timer); + + /* also cancel the sleepable work, but *do not* wait for + * it to finish if it was running as we might not be in a + * sleepable context. Same reason as above, it's fine to + * free 't': the subprog callback will never access it anymore + * and can not reschedule itself since timer->timer = NULL was + * already done. 
+ */ + cancel_work(&t->work); + kfree_rcu(t, rcu); }

From patchwork Fri Mar 15 14:29:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Tissoires
X-Patchwork-Id: 13593532
X-Patchwork-Delegate: bpf@iogearbox.net
From: Benjamin Tissoires
Date: Fri, 15 Mar 2024 15:29:26 +0100
Subject: [PATCH bpf-next v4 2/6] bpf/verifier: add bpf_timer as a kfunc capable type
X-Mailing-List: bpf@vger.kernel.org
Message-Id: <20240315-hid-bpf-sleepable-v4-2-5658f2540564@kernel.org>
References: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
In-Reply-To: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: Benjamin Tissoires, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org

We need to extend the bpf_timer API, but the way forward relies on kfuncs.
So make bpf_timer known for kfuncs from the verifier PoV Signed-off-by: Benjamin Tissoires --- changes in v4: - enforce KF_ARG_PTR_TO_TIMER to be of type PTR_TO_MAP_VALUE new in v3 (split from v2 02/10) --- kernel/bpf/verifier.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 63749ad5ac6b..1483ebc0ee73 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -10826,6 +10826,7 @@ enum { KF_ARG_LIST_NODE_ID, KF_ARG_RB_ROOT_ID, KF_ARG_RB_NODE_ID, + KF_ARG_TIMER_ID, }; BTF_ID_LIST(kf_arg_btf_ids) @@ -10834,6 +10835,7 @@ BTF_ID(struct, bpf_list_head) BTF_ID(struct, bpf_list_node) BTF_ID(struct, bpf_rb_root) BTF_ID(struct, bpf_rb_node) +BTF_ID(struct, bpf_timer_kern) static bool __is_kfunc_ptr_arg_type(const struct btf *btf, const struct btf_param *arg, int type) @@ -10877,6 +10879,12 @@ static bool is_kfunc_arg_rbtree_node(const struct btf *btf, const struct btf_par return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_RB_NODE_ID); } +static bool is_kfunc_arg_timer(const struct btf *btf, const struct btf_param *arg) +{ + bool ret = __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_TIMER_ID); + return ret; +} + static bool is_kfunc_arg_callback(struct bpf_verifier_env *env, const struct btf *btf, const struct btf_param *arg) { @@ -10946,6 +10954,7 @@ enum kfunc_ptr_arg_type { KF_ARG_PTR_TO_NULL, KF_ARG_PTR_TO_CONST_STR, KF_ARG_PTR_TO_MAP, + KF_ARG_PTR_TO_TIMER, }; enum special_kfunc_type { @@ -11102,6 +11111,9 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env, if (is_kfunc_arg_map(meta->btf, &args[argno])) return KF_ARG_PTR_TO_MAP; + if (is_kfunc_arg_timer(meta->btf, &args[argno])) + return KF_ARG_PTR_TO_TIMER; + if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) { if (!btf_type_is_struct(ref_t)) { verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n", @@ -11735,6 +11747,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct 
bpf_kfunc_call_ case KF_ARG_PTR_TO_CALLBACK: case KF_ARG_PTR_TO_REFCOUNTED_KPTR: case KF_ARG_PTR_TO_CONST_STR: + case KF_ARG_PTR_TO_TIMER: /* Trusted by default */ break; default: @@ -12021,6 +12034,12 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ if (ret) return ret; break; + case KF_ARG_PTR_TO_TIMER: + if (reg->type != PTR_TO_MAP_VALUE) { + verbose(env, "arg#%d doesn't point to a map value\n", i); + return -EINVAL; + } + break; } }

From patchwork Fri Mar 15 14:29:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Tissoires
X-Patchwork-Id: 13593533
X-Patchwork-Delegate: bpf@iogearbox.net
From: Benjamin Tissoires
Date: Fri, 15 Mar 2024 15:29:27 +0100
Subject: [PATCH bpf-next v4 3/6] bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc
X-Mailing-List: bpf@vger.kernel.org
Message-Id: <20240315-hid-bpf-sleepable-v4-3-5658f2540564@kernel.org>
References: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
In-Reply-To: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: Benjamin Tissoires, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org

In this patch, bpf_timer_set_sleepable_cb() is functionally equivalent to bpf_timer_set_callback(), with the exception that it requires the timer to be started with BPF_F_TIMER_SLEEPABLE.

But given that bpf_timer_set_callback() is a helper whereas bpf_timer_set_sleepable_cb() is a kfunc, we need to teach the verifier about its attached callback.

Marking that callback as sleepable will be done in a separate patch.

Signed-off-by: Benjamin Tissoires
Acked-by: Eduard Zingerman
---
changes in v4:
- added a new (ignored) argument to the kfunc so that we do not need to walk the stack

new in v3 (split from v2 02/10)
---
kernel/bpf/helpers.c | 46 +++++++++++++++++++++++++++++++++++++-- kernel/bpf/verifier.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 102 insertions(+), 4 deletions(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 38de73a9df83..65c07c0df263 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1108,6 +1108,7 @@ struct bpf_hrtimer { void *value; struct rcu_head rcu; spinlock_t sleepable_lock; + bool is_sleepable; }; /* the actual struct hidden inside uapi struct bpf_timer */ @@ -1273,8 +1274,8 @@ static const struct bpf_func_proto bpf_timer_init_proto = { .arg3_type = ARG_ANYTHING, }; -BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn, - struct bpf_prog_aux *, aux) +static int __bpf_timer_set_callback(struct bpf_timer_kern *timer, void *callback_fn, + struct bpf_prog_aux *aux, bool is_sleepable) { struct bpf_prog *prev, *prog = aux->prog; struct bpf_hrtimer *t; @@ -1314,6 +1315,7 @@ BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callb t->prog = prog; } rcu_assign_pointer(t->callback_fn, callback_fn); + t->is_sleepable = is_sleepable; out: if (t) spin_unlock(&t->sleepable_lock); @@ -1321,6 +1323,12 @@ BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *,
timer, void *, callb return ret; } +BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn, + struct bpf_prog_aux *, aux) +{ + return __bpf_timer_set_callback(timer, callback_fn, aux, false); +} + static const struct bpf_func_proto bpf_timer_set_callback_proto = { .func = bpf_timer_set_callback, .gpl_only = true, @@ -1350,6 +1358,11 @@ BPF_CALL_3(bpf_timer_start, struct bpf_timer_kern *, timer, u64, nsecs, u64, fla goto out; } + if (t->is_sleepable && !(flags & BPF_F_TIMER_SLEEPABLE)) { + ret = -EINVAL; + goto out; + } + if (flags & BPF_F_TIMER_ABS) mode = HRTIMER_MODE_ABS_SOFT; else @@ -2627,6 +2640,34 @@ __bpf_kfunc void bpf_throw(u64 cookie) WARN(1, "A call to BPF exception callback should never return\n"); } +/** + * bpf_timer_set_sleepable_cb_impl() - Configure the timer to call %callback_fn + * static function in a sleepable context. + * @timer: The bpf_timer that needs to be configured + * @callback_fn: a static bpf function + * + * @returns %0 on success. %-EINVAL if %timer was not initialized with + * bpf_timer_init() earlier. %-EPERM if %timer is in a map that doesn't + * have any user references. + * The user space should either hold a file descriptor to a map with timers + * or pin such map in bpffs. When map is unpinned or file descriptor is + * closed all timers in the map will be cancelled and freed. + * + * This kfunc is equivalent to %bpf_timer_set_callback except that it tells + * the verifier that the target callback is run in a sleepable context. 
+ */ +__bpf_kfunc int bpf_timer_set_sleepable_cb_impl(struct bpf_timer_kern *timer, + int (callback_fn)(void *map, int *key, struct bpf_timer *timer), + void *aux__ign) +{ + struct bpf_prog_aux *aux = (struct bpf_prog_aux *)aux__ign; + + if (!aux) + return -EINVAL; + + return __bpf_timer_set_callback(timer, (void *)callback_fn, aux, true); +} + __bpf_kfunc_end_defs(); BTF_KFUNCS_START(generic_btf_ids) @@ -2703,6 +2744,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null) BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) BTF_ID_FLAGS(func, bpf_dynptr_size) BTF_ID_FLAGS(func, bpf_dynptr_clone) +BTF_ID_FLAGS(func, bpf_timer_set_sleepable_cb_impl) BTF_KFUNCS_END(common_btf_ids) static const struct btf_kfunc_id_set common_kfunc_set = { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 1483ebc0ee73..53f85e114a33 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -501,8 +501,12 @@ static bool is_dynptr_ref_function(enum bpf_func_id func_id) } static bool is_sync_callback_calling_kfunc(u32 btf_id); +static bool is_async_callback_calling_kfunc(u32 btf_id); +static bool is_callback_calling_kfunc(u32 btf_id); static bool is_bpf_throw_kfunc(struct bpf_insn *insn); +static bool is_bpf_timer_set_sleepable_cb_impl_kfunc(u32 btf_id); + static bool is_sync_callback_calling_function(enum bpf_func_id func_id) { return func_id == BPF_FUNC_for_each_map_elem || @@ -530,7 +534,8 @@ static bool is_sync_callback_calling_insn(struct bpf_insn *insn) static bool is_async_callback_calling_insn(struct bpf_insn *insn) { - return bpf_helper_call(insn) && is_async_callback_calling_function(insn->imm); + return (bpf_helper_call(insn) && is_async_callback_calling_function(insn->imm)) || + (bpf_pseudo_kfunc_call(insn) && is_async_callback_calling_kfunc(insn->imm)); } static bool is_may_goto_insn(struct bpf_insn *insn) @@ -9471,7 +9476,7 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins */ env->subprog_info[subprog].is_cb = true; if 
(bpf_pseudo_kfunc_call(insn) && - !is_sync_callback_calling_kfunc(insn->imm)) { + !is_callback_calling_kfunc(insn->imm)) { verbose(env, "verifier bug: kfunc %s#%d not marked as callback-calling\n", func_id_name(insn->imm), insn->imm); return -EFAULT; @@ -10981,6 +10986,7 @@ enum special_kfunc_type { KF_bpf_percpu_obj_drop_impl, KF_bpf_throw, KF_bpf_iter_css_task_new, + KF_bpf_timer_set_sleepable_cb_impl, }; BTF_SET_START(special_kfunc_set) @@ -11007,6 +11013,7 @@ BTF_ID(func, bpf_throw) #ifdef CONFIG_CGROUPS BTF_ID(func, bpf_iter_css_task_new) #endif +BTF_ID(func, bpf_timer_set_sleepable_cb_impl) BTF_SET_END(special_kfunc_set) BTF_ID_LIST(special_kfunc_list) @@ -11037,6 +11044,7 @@ BTF_ID(func, bpf_iter_css_task_new) #else BTF_ID_UNUSED #endif +BTF_ID(func, bpf_timer_set_sleepable_cb_impl) static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta) { @@ -11365,12 +11373,28 @@ static bool is_sync_callback_calling_kfunc(u32 btf_id) return btf_id == special_kfunc_list[KF_bpf_rbtree_add_impl]; } +static bool is_async_callback_calling_kfunc(u32 btf_id) +{ + return btf_id == special_kfunc_list[KF_bpf_timer_set_sleepable_cb_impl]; +} + static bool is_bpf_throw_kfunc(struct bpf_insn *insn) { return bpf_pseudo_kfunc_call(insn) && insn->off == 0 && insn->imm == special_kfunc_list[KF_bpf_throw]; } +static bool is_bpf_timer_set_sleepable_cb_impl_kfunc(u32 btf_id) +{ + return btf_id == special_kfunc_list[KF_bpf_timer_set_sleepable_cb_impl]; +} + +static bool is_callback_calling_kfunc(u32 btf_id) +{ + return is_sync_callback_calling_kfunc(btf_id) || + is_async_callback_calling_kfunc(btf_id); +} + static bool is_rbtree_lock_required_kfunc(u32 btf_id) { return is_bpf_rbtree_api_kfunc(btf_id); @@ -12151,6 +12175,16 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, } } + if (is_async_callback_calling_kfunc(meta.func_id)) { + err = push_callback_call(env, insn, insn_idx, meta.subprogno, + set_timer_callback_state); + if (err) { + verbose(env, 
"kfunc %s#%d failed callback verification\n", + func_name, meta.func_id); + return err; + } + } + rcu_lock = is_kfunc_bpf_rcu_read_lock(&meta); rcu_unlock = is_kfunc_bpf_rcu_read_unlock(&meta); @@ -19544,6 +19578,28 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast]) { insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_1); *cnt = 1; + } else if (is_bpf_timer_set_sleepable_cb_impl_kfunc(desc->func_id)) { + /* The verifier will process callback_fn as many times as necessary + * with different maps and the register states prepared by + * set_timer_callback_state will be accurate. + * + * The following use case is valid: + * map1 is shared by prog1, prog2, prog3. + * prog1 calls bpf_timer_init for some map1 elements + * prog2 calls bpf_timer_set_callback for some map1 elements. + * Those that were not bpf_timer_init-ed will return -EINVAL. + * prog3 calls bpf_timer_start for some map1 elements. + * Those that were not both bpf_timer_init-ed and + * bpf_timer_set_callback-ed will return -EINVAL. 
+ */ + struct bpf_insn ld_addrs[2] = { + BPF_LD_IMM64(BPF_REG_3, (long)env->prog->aux), + }; + + insn_buf[0] = ld_addrs[0]; + insn_buf[1] = ld_addrs[1]; + insn_buf[2] = *insn; + *cnt = 3; } return 0; }

From patchwork Fri Mar 15 14:29:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Tissoires
X-Patchwork-Id: 13593534
X-Patchwork-Delegate: bpf@iogearbox.net
From: Benjamin Tissoires
Date: Fri, 15 Mar 2024 15:29:28 +0100
Subject: [PATCH bpf-next v4 4/6] bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable
X-Mailing-List: bpf@vger.kernel.org
Message-Id: <20240315-hid-bpf-sleepable-v4-4-5658f2540564@kernel.org>
References: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
In-Reply-To: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: Benjamin Tissoires, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org

Now that we have bpf_timer_set_sleepable_cb() available and working, we can tag the attached callback as sleepable, and let the verifier check the calls and kfuncs in the correct context.
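
As a rough illustration of the resulting rule, here is a minimal userspace sketch of the sleepable check as this patch changes it: a verifier state counts as sleepable if either the program itself is sleepable or the state currently being verified was pushed for a sleepable async callback. The model_* structs and model_in_sleepable() are simplified, hypothetical stand-ins for the verifier structures and the in_sleepable() helper. Only the fields relevant to this check are modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified, hypothetical stand-ins for struct bpf_prog,
 * struct bpf_verifier_state and struct bpf_verifier_env.
 */
struct model_prog {
	bool sleepable;
};

struct model_state {
	bool in_sleepable;	/* set when pushed for a sleepable async callback */
};

struct model_env {
	struct model_prog *prog;
	struct model_state *cur_state;	/* may be NULL, which the check tolerates */
};

/* Userspace model of the in_sleepable() logic after this patch. */
bool model_in_sleepable(const struct model_env *env)
{
	return env->prog->sleepable ||
	       (env->cur_state && env->cur_state->in_sleepable);
}
```

In this model a non-sleepable program still reports a sleepable context while its sleepable timer callback is being verified, which is what lets sleepable-only calls pass inside that callback.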

Signed-off-by: Benjamin Tissoires

---

changes in v4:
- use a function parameter to forward the sleepable information

new in v3 (split from v2 02/10)
---
 include/linux/bpf_verifier.h |  1 +
 kernel/bpf/verifier.c        | 13 ++++++++++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 7cb1b75eee38..14e4ee67b694 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -426,6 +426,7 @@ struct bpf_verifier_state {
 	 * while they are still in use.
 	 */
 	bool used_as_loop_entry;
+	bool in_sleepable;
 
 	/* first and last insn idx of this verifier state */
 	u32 first_insn_idx;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 53f85e114a33..0be07da38f8a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1434,6 +1434,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 	}
 	dst_state->speculative = src->speculative;
 	dst_state->active_rcu_lock = src->active_rcu_lock;
+	dst_state->in_sleepable = src->in_sleepable;
 	dst_state->curframe = src->curframe;
 	dst_state->active_lock.ptr = src->active_lock.ptr;
 	dst_state->active_lock.id = src->active_lock.id;
@@ -2407,7 +2408,7 @@ static void init_func_state(struct bpf_verifier_env *env,
 /* Similar to push_stack(), but for async callbacks */
 static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 						int insn_idx, int prev_insn_idx,
-						int subprog)
+						int subprog, bool is_sleepable)
 {
 	struct bpf_verifier_stack_elem *elem;
 	struct bpf_func_state *frame;
@@ -2434,6 +2435,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 	 * Initialize it similar to do_check_common().
 	 */
 	elem->st.branches = 1;
+	elem->st.in_sleepable = is_sleepable;
 	frame = kzalloc(sizeof(*frame), GFP_KERNEL);
 	if (!frame)
 		goto err;
@@ -5279,7 +5281,8 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
 
 static bool in_sleepable(struct bpf_verifier_env *env)
 {
-	return env->prog->sleepable;
+	return env->prog->sleepable ||
+	       (env->cur_state && env->cur_state->in_sleepable);
 }
 
 /* The non-sleepable programs and sleepable programs with explicit bpf_rcu_read_lock()
@@ -9493,7 +9496,8 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
 		/* there is no real recursion here. timer callbacks are async */
 		env->subprog_info[subprog].is_async_cb = true;
 		async_cb = push_async_cb(env, env->subprog_info[subprog].start,
-					 insn_idx, subprog);
+					 insn_idx, subprog,
+					 is_bpf_timer_set_sleepable_cb_impl_kfunc(insn->imm));
 		if (!async_cb)
 			return -EFAULT;
 		callee = async_cb->frame[0];
@@ -16937,6 +16941,9 @@ static bool states_equal(struct bpf_verifier_env *env,
 	if (old->active_rcu_lock != cur->active_rcu_lock)
 		return false;
 
+	if (old->in_sleepable != cur->in_sleepable)
+		return false;
+
 	/* for states to be equal callsites have to be the same
 	 * and all frame states need to be equivalent
 	 */

From patchwork Fri Mar 15 14:29:29 2024
X-Patchwork-Submitter: Benjamin Tissoires
X-Patchwork-Id: 13593535
From: Benjamin Tissoires
Date: Fri, 15 Mar 2024 15:29:29 +0100
Subject: [PATCH bpf-next v4 5/6] tools: sync include/uapi/linux/bpf.h
Message-Id: <20240315-hid-bpf-sleepable-v4-5-5658f2540564@kernel.org>
References: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
In-Reply-To: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>

cp include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h

Signed-off-by: Benjamin Tissoires

---

new in v4
---
 tools/include/uapi/linux/bpf.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 3c42b9f1bada..b90def29d796 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -7461,10 +7461,14 @@ struct bpf_core_relo {
  *     - BPF_F_TIMER_ABS: Timeout passed is absolute time, by default it is
  *       relative to current time.
  *     - BPF_F_TIMER_CPU_PIN: Timer will be pinned to the CPU of the caller.
+ *     - BPF_F_TIMER_SLEEPABLE: Timer will run in a sleepable context, with
+ *       no guarantees of ordering nor timing (consider this as being just
+ *       offloaded immediately).
  */
 enum {
 	BPF_F_TIMER_ABS = (1ULL << 0),
 	BPF_F_TIMER_CPU_PIN = (1ULL << 1),
+	BPF_F_TIMER_SLEEPABLE = (1ULL << 2),
 };
 
 /* BPF numbers iterator state */

From patchwork Fri Mar 15 14:29:30 2024
X-Patchwork-Submitter: Benjamin Tissoires
X-Patchwork-Id: 13593536
From: Benjamin Tissoires
Date: Fri, 15 Mar 2024 15:29:30 +0100
Subject: [PATCH bpf-next v4 6/6] selftests/bpf: add sleepable timer tests
Message-Id: <20240315-hid-bpf-sleepable-v4-6-5658f2540564@kernel.org>
References: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>
In-Reply-To: <20240315-hid-bpf-sleepable-v4-0-5658f2540564@kernel.org>

bpf_experimental.h and ../bpf_testmod/bpf_testmod_kfunc.h both include
vmlinux.h, which is not compatible with including time.h or
bpf_tcp_helpers.h.

So prevent vmlinux.h from being included, and override the few missing
types.
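For readers unfamiliar with the trick, here is a single-file sketch of it,
under stated assumptions: TOY_BIG_TYPES_H, toy_u32, toy_u64 and toy_widen()
are hypothetical names, and the guarded block stands in for the body of a
header that would normally be pulled in. The guard macro is defined before
the "header" is reached, so its body is skipped and the few types provided by
hand are the ones that get used.

```c
#include <assert.h>
#include <stdint.h>

/* 1. Claim the include guard of the (hypothetical) heavyweight header. */
#define TOY_BIG_TYPES_H

/* 2. Provide by hand only the few types the dependent code needs. */
typedef uint32_t toy_u32;
typedef uint64_t toy_u64;

/* 3. Stand-in for the heavyweight header's body. Because the guard is
 *    already defined, none of this is compiled; otherwise the typedefs
 *    below would clash with the ones above.
 */
#ifndef TOY_BIG_TYPES_H
#define TOY_BIG_TYPES_H
typedef struct { int unrelated; } toy_u32;
typedef struct { int unrelated; } toy_u64;
#endif

/* 4. Dependent code only sees the hand-provided types. */
static toy_u64 toy_widen(toy_u32 v)
{
	return (toy_u64)v << 32;
}

/* 5. Release the guard once the dependent headers have been processed. */
#undef TOY_BIG_TYPES_H
```

The selftests in this patch use #define for u32 and u64 rather than typedef;
the sketch uses typedef only to make the would-be clash explicit.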

Signed-off-by: Benjamin Tissoires

---

new in v4
---
 tools/testing/selftests/bpf/bpf_experimental.h     |   4 +
 .../selftests/bpf/bpf_testmod/bpf_testmod.c        |   5 +
 .../selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h  |   1 +
 tools/testing/selftests/bpf/prog_tests/timer.c     |   1 +
 tools/testing/selftests/bpf/progs/timer.c          |  40 +++++++-
 tools/testing/selftests/bpf/progs/timer_failure.c  | 114 ++++++++++++++++++++-
 6 files changed, 163 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index a5b9df38c162..79da06ca4136 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -459,4 +459,8 @@ extern int bpf_iter_css_new(struct bpf_iter_css *it,
 extern struct cgroup_subsys_state *bpf_iter_css_next(struct bpf_iter_css *it) __weak __ksym;
 extern void bpf_iter_css_destroy(struct bpf_iter_css *it) __weak __ksym;
 
+extern int bpf_timer_set_sleepable_cb_impl(struct bpf_timer *timer,
+		int (callback_fn)(void *map, int *key, struct bpf_timer *timer), void *aux__ign) __ksym;
+#define bpf_timer_set_sleepable_cb(timer, cb) \
+	bpf_timer_set_sleepable_cb_impl(timer, cb, NULL)
 #endif
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 39ad96a18123..eb2b78552ca2 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -494,6 +494,10 @@ __bpf_kfunc static u32 bpf_kfunc_call_test_static_unused_arg(u32 arg, u32 unused
 	return arg;
 }
 
+__bpf_kfunc void bpf_kfunc_call_test_sleepable(void)
+{
+}
+
 BTF_KFUNCS_START(bpf_testmod_check_kfunc_ids)
 BTF_ID_FLAGS(func, bpf_testmod_test_mod_kfunc)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test1)
@@ -520,6 +524,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_test_ref, KF_TRUSTED_ARGS | KF_RCU)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test_destructive, KF_DESTRUCTIVE)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test_static_unused_arg)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test_offset)
+BTF_ID_FLAGS(func, bpf_kfunc_call_test_sleepable, KF_SLEEPABLE)
 BTF_KFUNCS_END(bpf_testmod_check_kfunc_ids)
 
 static int bpf_testmod_ops_init(struct btf *btf)
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
index 7c664dd61059..ce5cd763561c 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
@@ -96,6 +96,7 @@ void bpf_kfunc_call_test_pass2(struct prog_test_pass2 *p) __ksym;
 void bpf_kfunc_call_test_mem_len_fail2(__u64 *mem, int len) __ksym;
 
 void bpf_kfunc_call_test_destructive(void) __ksym;
+void bpf_kfunc_call_test_sleepable(void) __ksym;
 
 void bpf_kfunc_call_test_offset(struct prog_test_ref_kfunc *p);
 struct prog_test_member *bpf_kfunc_call_memb_acquire(void);
diff --git a/tools/testing/selftests/bpf/prog_tests/timer.c b/tools/testing/selftests/bpf/prog_tests/timer.c
index d66687f1ee6a..48973c2e28c7 100644
--- a/tools/testing/selftests/bpf/prog_tests/timer.c
+++ b/tools/testing/selftests/bpf/prog_tests/timer.c
@@ -61,6 +61,7 @@ static int timer(struct timer *timer_skel)
 
 	/* check that code paths completed */
 	ASSERT_EQ(timer_skel->bss->ok, 1 | 2 | 4, "ok");
+	ASSERT_EQ(timer_skel->bss->ok_sleepable, 1, "ok_sleepable");
 
 	prog_fd = bpf_program__fd(timer_skel->progs.race);
 	for (i = 0; i < NUM_THR; i++) {
diff --git a/tools/testing/selftests/bpf/progs/timer.c b/tools/testing/selftests/bpf/progs/timer.c
index f615da97df26..6b19254c5b75 100644
--- a/tools/testing/selftests/bpf/progs/timer.c
+++ b/tools/testing/selftests/bpf/progs/timer.c
@@ -6,6 +6,14 @@
 #include
 #include "bpf_tcp_helpers.h"
 
+#define __VMLINUX_H__
+#define u32 __u32
+#define u64 __u64
+#include "bpf_experimental.h"
+struct prog_test_member1;
+#include "../bpf_testmod/bpf_testmod_kfunc.h"
+#undef __VMLINUX_H__
+
 char _license[] SEC("license") = "GPL";
 struct hmap_elem {
 	int counter;
@@ -34,7 +42,7 @@ struct elem {
 
 struct {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
-	__uint(max_entries, 2);
+	__uint(max_entries, 3);
 	__type(key, int);
 	__type(value, struct elem);
 } array SEC(".maps");
@@ -62,6 +70,7 @@ __u64 callback_check = 52;
 __u64 callback2_check = 52;
 __u64 pinned_callback_check;
 __s32 pinned_cpu;
+__u32 ok_sleepable;
 
 #define ARRAY 1
 #define HTAB 2
@@ -422,3 +431,32 @@ int race(void *ctx)
 
 	return 0;
 }
+
+/* callback for sleepable timer */
+static int timer_cb_sleepable(void *map, int *key, struct bpf_timer *timer)
+{
+	bpf_kfunc_call_test_sleepable();
+	ok_sleepable |= 1;
+	return 0;
+}
+
+SEC("fentry/bpf_fentry_test6")
+int BPF_PROG2(test6, int, a)
+{
+	int key = 2;
+	struct bpf_timer *timer;
+
+	bpf_printk("test6");
+
+	timer = bpf_map_lookup_elem(&array, &key);
+	if (timer) {
+		if (bpf_timer_init(timer, &array, CLOCK_MONOTONIC) != 0)
+			err |= 32768;
+		bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable);
+		bpf_timer_start(timer, 0, BPF_F_TIMER_SLEEPABLE);
+	} else {
+		err |= 65536;
+	}
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/timer_failure.c b/tools/testing/selftests/bpf/progs/timer_failure.c
index 0996c2486f05..72942a90189b 100644
--- a/tools/testing/selftests/bpf/progs/timer_failure.c
+++ b/tools/testing/selftests/bpf/progs/timer_failure.c
@@ -8,6 +8,14 @@
 #include "bpf_misc.h"
 #include "bpf_tcp_helpers.h"
 
+#define __VMLINUX_H__
+#define u32 __u32
+#define u64 __u64
+#include "bpf_experimental.h"
+struct prog_test_member1;
+#include "../bpf_testmod/bpf_testmod_kfunc.h"
+#undef __VMLINUX_H__
+
 char _license[] SEC("license") = "GPL";
 
 struct elem {
@@ -16,7 +24,7 @@ struct elem {
 
 struct {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
-	__uint(max_entries, 1);
+	__uint(max_entries, 2);
 	__type(key, int);
 	__type(value, struct elem);
 } timer_map SEC(".maps");
@@ -66,3 +74,107 @@ long BPF_PROG2(test_bad_ret, int, a)
 
 	return 0;
 }
+
+/* callback for sleepable timer */
+static int timer_cb_sleepable(void *map, int *key, struct bpf_timer *timer)
+{
+	bpf_kfunc_call_test_sleepable();
+	return 0;
+}
+
+SEC("fentry/bpf_fentry_test1")
+__log_level(2)
+__failure
+/* check that bpf_timer_set_callback() can not be called with a
+ * sleepable callback
+ */
+__msg("mark_precise: frame0: regs=r0 stack= before")
+__msg(": (85) call bpf_kfunc_call_test_sleepable#") /* anchor message */
+__msg("program must be sleepable to call sleepable kfunc bpf_kfunc_call_test_sleepable")
+int BPF_PROG2(test_non_sleepable_sleepable_callback, int, a)
+{
+	int key = 0;
+	struct bpf_timer *timer;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (timer) {
+		bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC);
+		bpf_timer_set_callback(timer, timer_cb_sleepable);
+		bpf_timer_start(timer, 0, BPF_F_TIMER_SLEEPABLE);
+	}
+
+	return 0;
+}
+
+SEC("tc")
+/* check that calling bpf_timer_start() without BPF_F_TIMER_SLEEPABLE on a sleepable
+ * callback is returning -EINVAL
+ */
+__retval(-22)
+long test_call_sleepable_missing_flag(void *ctx)
+{
+	int key = 0;
+	struct bpf_timer *timer;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (!timer)
+		return 1;
+
+	if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC))
+		return 2;
+
+	if (bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable))
+		return 3;
+
+	return bpf_timer_start(timer, 0, 0);
+}
+
+SEC("tc")
+/* check that calling bpf_timer_start() with a non-zero delay on a sleepable
+ * callback is returning -EINVAL
+ */
+__retval(-22)
+long test_call_sleepable_delay(void *ctx)
+{
+	int key = 1;
+	struct bpf_timer *timer;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (!timer)
+		return 1;
+
+	if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC))
+		return 2;
+
+	if (bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable))
+		return 3;
+
+	return bpf_timer_start(timer, 1, BPF_F_TIMER_SLEEPABLE);
+}
+
+SEC("tc")
+__log_level(2)
+__failure
+/* check that the first argument of bpf_timer_set_sleepable_cb()
+ * is a correct bpf_timer pointer.
+ */
+__msg("mark_precise: frame0: regs=r1 stack= before")
+__msg(": (85) call bpf_timer_set_sleepable_cb_impl#") /* anchor message */
+__msg("arg#0 doesn't point to a map value")
+long test_wrong_pointer(void *ctx)
+{
+	int key = 0;
+	struct bpf_timer *timer;
+
+	timer = bpf_map_lookup_elem(&timer_map, &key);
+	if (!timer)
+		return 1;
+
+	if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC))
+		return 2;
+
+	if (bpf_timer_set_sleepable_cb((void *)&timer, timer_cb_sleepable))
+		return 3;
+
+	return -22;
+}