From patchwork Mon Apr 8 08:09:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Tissoires X-Patchwork-Id: 13620720 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6CCF6364AE; Mon, 8 Apr 2024 08:09:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563795; cv=none; b=r5Yrv01FElTRzh5uId4BNdqifhp3s+i4XAcTNijIo36yJ5FFkh2UOdaHW4RMIQqXZdK9Of0f94zGtDSM5BJMiKLtmjDPkcSffH3KbJAlnNWYR0hTs5zDCQHzHFVy+EkYN77ZOXFTJ84RJOXSuh4tRKKtE4qhkU/ZPEsvtp2OxEA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563795; c=relaxed/simple; bh=VANyjQaTmLoOHEbU+/Mh4TaRphGOd6knnqKxEaiLGlY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=R1c+e/xOiQVBXMX2J4am/cYRDIaAA7+McnfGXJAZybs0X/W2liKEvotTaXF2FUg/ckV6PvwNlYJ0bNolG85R3ysPBp5jWA8AMN1Okey8aR+F/K50++N+Vv5/gD3AKQ5zDMYQR9uC3Z2WLlIDTy08B7LxX2IsULHqaVMPe+KPDiM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=nXN4idVV; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="nXN4idVV" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 919B4C43394; Mon, 8 Apr 2024 08:09:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712563795; bh=VANyjQaTmLoOHEbU+/Mh4TaRphGOd6knnqKxEaiLGlY=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=nXN4idVVWrGZEqVjAkBeTdPSC8UDXppwvJeOb/Jk6xRpE1aiyX100zJ9O3Fdn1JOi Lw+UJQELHkW8XtSjcIU51q1h4+wyxgHh6mmmfFM94rURIlC9Dvv9qPFFPvUtTtNKw/ EeBq6iYnoh00nFEITY23Vdt8eNumzsiAKByTd3nO8lGe5tyw62R5TyuvJrh+1pgL24 RnbPypE5HdnoM0pv1DegvB0m7xCZ/k2BC+zb4HdrozcJ5+VxUIXYT9vnh6mHlE2LeG 9YsjVt4WG/aktoSn+6SFqMjWNiIzRNiSQ2ippDZblwGZLM+rzgjyFaotBNNi01v5un pPbGtqH3H1rwQ== From: Benjamin Tissoires Date: Mon, 08 Apr 2024 10:09:26 +0200 Subject: [PATCH RFC bpf-next v6 1/6] bpf/helpers: introduce sleepable bpf_timers Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240408-hid-bpf-sleepable-v6-1-0499ddd91b94@kernel.org> References: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> In-Reply-To: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> To: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan Cc: Benjamin Tissoires , bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org X-Mailer: b4 0.12.4 X-Developer-Signature: v=1; a=ed25519-sha256; t=1712563787; l=11246; i=bentiss@kernel.org; s=20230215; h=from:subject:message-id; bh=VANyjQaTmLoOHEbU+/Mh4TaRphGOd6knnqKxEaiLGlY=; b=oKchrtdzCzSgjD8vA4yeXn7tF/hEtaKKVB6pHPh403b6YHLWPBSMQTFvwW6KHrEsGOEFaHHsK CbT+bAZHGKTDa+FiSZPXXbFZ6nA1MKuNfRAL0PgpUnxovi4l1R/ohL+ X-Developer-Key: i=bentiss@kernel.org; a=ed25519; pk=7D1DyAVh6ajCkuUTudt/chMuXWIJHlv2qCsRkIizvFw= X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC They are implemented as a workqueue, which means that there are no guarantees of timing nor ordering. Signed-off-by: Benjamin Tissoires --- changes in v6: - dropped extra spinlock - implement cancel_and_free of a sleepable timer through a dedicated workqueue no changes in v5 changes in v4: - dropped __bpf_timer_compute_key() - use a spin_lock instead of a semaphore - ensure bpf_timer_cancel_and_free is not complaining about non sleepable context and use cancel_work() instead of cancel_work_sync() - return -EINVAL if a delay is given to bpf_timer_start() with BPF_F_TIMER_SLEEPABLE changes in v3: - extracted the implementation in bpf_timer only, without bpf_timer_set_sleepable_cb() - rely on schedule_work() only, from bpf_timer_start() - add semaphore to ensure bpf_timer_work_cb() is accessing consistent data changes in v2 (compared to the one attaches to v1 0/9): - make use of a kfunc - add a (non-used) BPF_F_TIMER_SLEEPABLE - the callback is *not* called, it makes the kernel crashes --- include/uapi/linux/bpf.h | 13 ++++ kernel/bpf/helpers.c | 159 ++++++++++++++++++++++++++++++++++++++--------- 2 files changed, 143 insertions(+), 29 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 9585f5345353..f1890eed213a 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -7469,6 +7469,19 @@ enum { BPF_F_TIMER_CPU_PIN = (1ULL << 1), }; +/* Extra flags to control bpf_timer_init() behaviour, in addition to CLOCK_*. + * - BPF_F_TIMER_SLEEPABLE: Timer will run in a workqueue in a sleepable + * context, with no guarantees of ordering nor timing (consider this as + * being just offloaded immediately). + */ +enum { + /* CLOCK_* are using bits 0 to 3 */ + BPF_F_TIMER_SLEEPABLE = (1ULL << 4), + __MAX_BPF_F_TIMER_INIT, +}; + +#define MAX_BPF_F_TIMER_INIT __MAX_BPF_F_TIMER_INIT + /* BPF numbers iterator state */ struct bpf_iter_num { /* opaque iterator state; having __u64 here allows to preserve correct diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 9234174ccb21..fd05d4358b31 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1096,12 +1096,19 @@ const struct bpf_func_proto bpf_snprintf_proto = { * freeing the timers when inner map is replaced or deleted by user space. */ struct bpf_hrtimer { - struct hrtimer timer; + union { + struct hrtimer timer; + struct work_struct work; + }; struct bpf_map *map; struct bpf_prog *prog; void __rcu *callback_fn; void *value; - struct rcu_head rcu; + union { + struct rcu_head rcu; + struct work_struct sync_work; + }; + u64 flags; }; /* the actual struct hidden inside uapi struct bpf_timer */ @@ -1114,6 +1121,58 @@ struct bpf_timer_kern { struct bpf_spin_lock lock; } __attribute__((aligned(8))); +static void bpf_timer_work_cb(struct work_struct *work) +{ + struct bpf_hrtimer *t = container_of(work, struct bpf_hrtimer, work); + struct bpf_tramp_run_ctx __maybe_unused run_ctx; + struct bpf_prog *prog = t->prog; + struct bpf_map *map = t->map; + bpf_callback_t callback_fn; + void *value = t->value; + unsigned long flags; + void *key; + u32 idx; + + BTF_TYPE_EMIT(struct bpf_timer); + + callback_fn = READ_ONCE(t->callback_fn); + if (!callback_fn || !prog) + return; + + if (map->map_type == BPF_MAP_TYPE_ARRAY) { + struct bpf_array *array = container_of(map, struct bpf_array, map); + + /* compute the key */ + idx = ((char *)value - array->value) / array->elem_size; + key = &idx; + } else { /* hash or lru */ + key = value - round_up(map->key_size, 8); + } + + run_ctx.bpf_cookie = 0; + + if (!__bpf_prog_enter_sleepable_recur(prog, &run_ctx)) { + /* recursion detected */ + __bpf_prog_exit_sleepable_recur(prog, 0, &run_ctx); + return; + } + + callback_fn((u64)(long)map, (u64)(long)key, (u64)(long)value, 0, 0); + /* The verifier checked that return value is zero. */ + + __bpf_prog_exit_sleepable_recur(prog, 0 /* bpf_prog_run does runtime stats */, + &run_ctx); +} + +static void bpf_timer_sync_work_cb(struct work_struct *work) +{ + struct bpf_hrtimer *t = container_of(work, struct bpf_hrtimer, sync_work); + + cancel_work_sync(&t->work); + + kfree_rcu(t, rcu); +} + static DEFINE_PER_CPU(struct bpf_hrtimer *, hrtimer_running); static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer) @@ -1155,10 +1214,14 @@ static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer) return HRTIMER_NORESTART; } +#define BPF_TIMER_CLOCK_MASK (MAX_CLOCKS - 1) /* 0xf */ +#define BPF_TIMER_FLAGS_MASK GENMASK_ULL(63, 4) + BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map, u64, flags) { - clockid_t clockid = flags & (MAX_CLOCKS - 1); + clockid_t clockid = flags & BPF_TIMER_CLOCK_MASK; + u64 bpf_timer_flags = flags & BPF_TIMER_FLAGS_MASK; struct bpf_hrtimer *t; int ret = 0; @@ -1169,7 +1232,7 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map if (in_nmi()) return -EOPNOTSUPP; - if (flags >= MAX_CLOCKS || + if (bpf_timer_flags & ~(BPF_F_TIMER_SLEEPABLE) || /* similar to timerfd except _ALARM variants are not supported */ (clockid != CLOCK_MONOTONIC && clockid != CLOCK_REALTIME && @@ -1190,8 +1253,14 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map t->value = (void *)timer - map->record->timer_off; t->map = map; t->prog = NULL; + t->flags = flags; rcu_assign_pointer(t->callback_fn, NULL); - hrtimer_init(&t->timer, clockid, HRTIMER_MODE_REL_SOFT); + if (flags & BPF_F_TIMER_SLEEPABLE) { + INIT_WORK(&t->work, bpf_timer_work_cb); + INIT_WORK(&t->sync_work, bpf_timer_sync_work_cb); + } else { + hrtimer_init(&t->timer, clockid, HRTIMER_MODE_REL_SOFT); + } t->timer.function = bpf_timer_cb; WRITE_ONCE(timer->timer, t); /* Guarantee the order between timer->timer and map->usercnt. So @@ -1292,6 +1361,11 @@ BPF_CALL_3(bpf_timer_start, struct bpf_timer_kern *, timer, u64, nsecs, u64, fla goto out; } + if ((t->flags & BPF_F_TIMER_SLEEPABLE) && nsecs) { + ret = -EINVAL; + goto out; + } + if (flags & BPF_F_TIMER_ABS) mode = HRTIMER_MODE_ABS_SOFT; else @@ -1300,7 +1374,10 @@ BPF_CALL_3(bpf_timer_start, struct bpf_timer_kern *, timer, u64, nsecs, u64, fla if (flags & BPF_F_TIMER_CPU_PIN) mode |= HRTIMER_MODE_PINNED; - hrtimer_start(&t->timer, ns_to_ktime(nsecs), mode); + if (t->flags & BPF_F_TIMER_SLEEPABLE) + schedule_work(&t->work); + else + hrtimer_start(&t->timer, ns_to_ktime(nsecs), mode); out: __bpf_spin_unlock_irqrestore(&timer->lock); return ret; @@ -1329,6 +1406,7 @@ static void drop_prog_refcnt(struct bpf_hrtimer *t) BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer) { struct bpf_hrtimer *t; + u64 flags = 0; int ret = 0; if (in_nmi()) @@ -1340,6 +1418,7 @@ BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer) ret = -EINVAL; goto out; } + flags = t->flags; if (this_cpu_read(hrtimer_running) == t) { /* If bpf callback_fn is trying to bpf_timer_cancel() * its own timer the hrtimer_cancel() will deadlock @@ -1351,10 +1430,20 @@ BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer) drop_prog_refcnt(t); out: __bpf_spin_unlock_irqrestore(&timer->lock); - /* Cancel the timer and wait for associated callback to finish - * if it was running. - */ - ret = ret ?: hrtimer_cancel(&t->timer); + + if (flags & BPF_F_TIMER_SLEEPABLE) { + /* Cancel the sleepable work, but *do not* wait for + * it to finish if it was running as we might not be in a + * sleepable context + */ + ret = ret ?: cancel_work(&t->work); + } else { + /* Cancel the timer and wait for associated callback to finish + * if it was running. + */ + ret = ret ?: hrtimer_cancel(&t->timer); + } + rcu_read_unlock(); return ret; } @@ -1373,6 +1462,7 @@ void bpf_timer_cancel_and_free(void *val) { struct bpf_timer_kern *timer = val; struct bpf_hrtimer *t; + u64 flags; /* Performance optimization: read timer->timer without lock first. */ if (!READ_ONCE(timer->timer)) @@ -1383,6 +1473,7 @@ void bpf_timer_cancel_and_free(void *val) t = timer->timer; if (!t) goto out; + flags = t->flags; drop_prog_refcnt(t); /* The subsequent bpf_timer_start/cancel() helpers won't be able to use * this timer, since it won't be initialized. @@ -1392,25 +1483,35 @@ void bpf_timer_cancel_and_free(void *val) __bpf_spin_unlock_irqrestore(&timer->lock); if (!t) return; - /* Cancel the timer and wait for callback to complete if it was running. - * If hrtimer_cancel() can be safely called it's safe to call kfree(t) - * right after for both preallocated and non-preallocated maps. - * The timer->timer = NULL was already done and no code path can - * see address 't' anymore. - * - * Check that bpf_map_delete/update_elem() wasn't called from timer - * callback_fn. In such case don't call hrtimer_cancel() (since it will - * deadlock) and don't call hrtimer_try_to_cancel() (since it will just - * return -1). Though callback_fn is still running on this cpu it's - * safe to do kfree(t) because bpf_timer_cb() read everything it needed - * from 't'. The bpf subprog callback_fn won't be able to access 't', - * since timer->timer = NULL was already done. The timer will be - * effectively cancelled because bpf_timer_cb() will return - * HRTIMER_NORESTART. - */ - if (this_cpu_read(hrtimer_running) != t) - hrtimer_cancel(&t->timer); - kfree_rcu(t, rcu); + + if (flags & BPF_F_TIMER_SLEEPABLE) { + /* Trigger cancel of the sleepable work, but *do not* wait for + * it to finish if it was running as we might not be in a + * sleepable context. + * kfree will be called once the work has finished. + */ + schedule_work(&t->sync_work); + } else { + /* Cancel the timer and wait for callback to complete if it was running. + * If hrtimer_cancel() can be safely called it's safe to call kfree(t) + * right after for both preallocated and non-preallocated maps. + * The timer->timer = NULL was already done and no code path can + * see address 't' anymore. + * + * Check that bpf_map_delete/update_elem() wasn't called from timer + * callback_fn. In such case don't call hrtimer_cancel() (since it will + * deadlock) and don't call hrtimer_try_to_cancel() (since it will just + * return -1). Though callback_fn is still running on this cpu it's + * safe to do kfree(t) because bpf_timer_cb() read everything it needed + * from 't'. The bpf subprog callback_fn won't be able to access 't', + * since timer->timer = NULL was already done. The timer will be + * effectively cancelled because bpf_timer_cb() will return + * HRTIMER_NORESTART. + */ + if (this_cpu_read(hrtimer_running) != t) + hrtimer_cancel(&t->timer); + kfree_rcu(t, rcu); + } } BPF_CALL_2(bpf_kptr_xchg, void *, map_value, void *, ptr) From patchwork Mon Apr 8 08:09:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Tissoires X-Patchwork-Id: 13620721 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 504B2383A0; Mon, 8 Apr 2024 08:09:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563799; cv=none; b=X61BDpjdSnQ+6W3MiuKmSVuz9TUjaIzLAqYyHWtgzhX/wiXUcaPIaF9yzmzIwEK2jnc/9S5fC8/ghsJrD+zNQ65QO4uh+kdUfat3ENQZnOjEAQf4envcbbti0DqN95uyIxfD3mNDX7cp9Yo8cBdSMj8ozKtLz6z59WH8I4X2kvg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563799; c=relaxed/simple; bh=JNFc4F167hpSFgsBvPrqtFJbZiF/Fqk7yk6nxtVXHRA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=ly8FfUjglA/uTv2aCely/+Xebw/CHm26Rknl3aGCEOtCw3WjdD47neVpdpyXSFcinvbIj41YoTRTxkHFy9YVm5zNA8R7iX30yofhHTKqkMacTuCGwFS4BtOzMoCZNZK12TNZFmwxJYu7aEhPcszda26u4IcrazM34/cLs2xuWxI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ZFJLfVAe; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ZFJLfVAe" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 64BADC43399; Mon, 8 Apr 2024 08:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712563799; bh=JNFc4F167hpSFgsBvPrqtFJbZiF/Fqk7yk6nxtVXHRA=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=ZFJLfVAewQCa4CdnoSF20zK/+VJRD/0CYBVd+/ksicWGIwtsbojBOgZ9HUf5WHZ1f R/S/IZlWOvjQ8gtNqhfZ8Hjc2HrkNIPfzlm+OvFgOuqVNmQRCnCRZ/8nEvFFmAzbVO Xf9wCjzf6ilvfgTOOQE+YS2P9fgdDq01+qu/6HAhSlJgqPVaTrpa3O+X9+8hFqH5MP K/h6FgsY65quopECOF9tu2kNEhF1uA6jH7YUIUFb0behIdLeHVK6OKeH0Yef/6pGNh sgAfsdvQGMzo0gDoHsWdYo+6/DiiDExpN/4Pw4mxQn4vBzyjzyAqD7eQ8Eu/8H4ock ubetS8ez4BIWg== From: bentiss@kernel.org Date: Mon, 08 Apr 2024 10:09:27 +0200 Subject: [PATCH RFC bpf-next v6 2/6] bpf: Add support for KF_ARG_PTR_TO_TIMER Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240408-hid-bpf-sleepable-v6-2-0499ddd91b94@kernel.org> References: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> In-Reply-To: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> To: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan Cc: Benjamin Tissoires , bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, Kumar Kartikeya Dwivedi X-Mailer: b4 0.12.4 X-Developer-Signature: v=1; a=ed25519-sha256; t=1712563787; l=4204; i=bentiss@kernel.org; s=20230215; h=from:subject:message-id; bh=s0w876z4WCZOEAGRxkVjR4XsfmFuFBrGvAiFCapdFjE=; b=/y+TNn3r9mE73W+o37r2GOBGpi5eXAv161C+UefqHJaSyeki0PLrwsYHsOpyYZ2stCWQdr5NX OtpVNxZ6sVBCFR+kpSwSuyKHUSShBibNb3V6krCnyoro0O5PIS2AvIH X-Developer-Key: i=bentiss@kernel.org; a=ed25519; pk=7D1DyAVh6ajCkuUTudt/chMuXWIJHlv2qCsRkIizvFw= X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC From: Kumar Kartikeya Dwivedi Introduce support for KF_ARG_PTR_TO_TIMER. The kfuncs will use bpf_timer as argument and that will be recognized as timer argument by verifier. bpf_timer_kern casting can happen inside kfunc, but using bpf_timer in argument makes life easier for users who work with non-kern type in BPF progs. Fix up process_timer_func's meta argument usage (ignore if NULL) so that we can share the same checks for helpers and kfuncs. meta argument is only needed to ensure bpf_timer_init's timer and map arguments are coming from the same map (map_uid logic is necessary for correct inner-map handling). No such concerns will be necessary for kfuncs as timer initialization happens using helpers, hence pass NULL to process_timer_func from kfunc argument handling code to ignore it. Signed-off-by: Kumar Kartikeya Dwivedi Signed-off-by: Benjamin Tissoires Acked-by: Eduard Zingerman --- changes in v6: - used Kumar's version of the patch - reverted `+BTF_ID(struct, bpf_timer_kern)` changes in v5: - also check for the reg offset changes in v4: - enforce KF_ARG_PTR_TO_TIMER to be of type PTR_TO_MAP_VALUE new in v3 (split from v2 02/10) --- kernel/bpf/verifier.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index ca6cacf7b42f..ccfe9057d8dc 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -7568,12 +7568,16 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno, val + reg->off, map->record->timer_off); return -EINVAL; } + /* meta is only needed for bpf_timer_init to match timer and map */ + if (!meta) + goto out; if (meta->map_ptr) { verbose(env, "verifier bug. Two map pointers in a timer helper\n"); return -EFAULT; } meta->map_uid = reg->map_uid; meta->map_ptr = map; +out: return 0; } @@ -10826,6 +10830,7 @@ enum { KF_ARG_LIST_NODE_ID, KF_ARG_RB_ROOT_ID, KF_ARG_RB_NODE_ID, + KF_ARG_TIMER_ID, }; BTF_ID_LIST(kf_arg_btf_ids) @@ -10834,6 +10839,7 @@ BTF_ID(struct, bpf_list_head) BTF_ID(struct, bpf_list_node) BTF_ID(struct, bpf_rb_root) BTF_ID(struct, bpf_rb_node) +BTF_ID(struct, bpf_timer_kern) static bool __is_kfunc_ptr_arg_type(const struct btf *btf, const struct btf_param *arg, int type) @@ -10877,6 +10883,11 @@ static bool is_kfunc_arg_rbtree_node(const struct btf *btf, const struct btf_par return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_RB_NODE_ID); } +static bool is_kfunc_arg_timer(const struct btf *btf, const struct btf_param *arg) +{ + return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_TIMER_ID); +} + static bool is_kfunc_arg_callback(struct bpf_verifier_env *env, const struct btf *btf, const struct btf_param *arg) { @@ -10946,6 +10957,7 @@ enum kfunc_ptr_arg_type { KF_ARG_PTR_TO_NULL, KF_ARG_PTR_TO_CONST_STR, KF_ARG_PTR_TO_MAP, + KF_ARG_PTR_TO_TIMER, }; enum special_kfunc_type { @@ -11102,6 +11114,9 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env, if (is_kfunc_arg_map(meta->btf, &args[argno])) return KF_ARG_PTR_TO_MAP; + if (is_kfunc_arg_timer(meta->btf, &args[argno])) + return KF_ARG_PTR_TO_TIMER; + if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) { if (!btf_type_is_struct(ref_t)) { verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n", @@ -11735,6 +11750,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ case KF_ARG_PTR_TO_CALLBACK: case KF_ARG_PTR_TO_REFCOUNTED_KPTR: case KF_ARG_PTR_TO_CONST_STR: + case KF_ARG_PTR_TO_TIMER: /* Trusted by default */ break; default: @@ -12021,6 +12037,15 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ if (ret) return ret; break; + case KF_ARG_PTR_TO_TIMER: + if (reg->type != PTR_TO_MAP_VALUE) { + verbose(env, "arg#%d doesn't point to a map value\n", i); + return -EINVAL; + } + ret = process_timer_func(env, regno, NULL); + if (ret < 0) + return ret; + break; } } From patchwork Mon Apr 8 08:09:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Tissoires X-Patchwork-Id: 13620722 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 32B622E400; Mon, 8 Apr 2024 08:10:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563803; cv=none; b=o4Dcp7Y1Kqau+zrPog1BFm1N6oC04IKHqGGmgOue8QUW+1EqxsJc8EyiXPDB+JHiwP2RVtDdeuiqxRq2CO79K28AQRs9/LKuBsqakcp89Otin9piDy20H5u5d8hEuGYU2q/y6yYNL8OrGbt0ZFeFV5ustFbnolSWDs7hthlVPtg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563803; c=relaxed/simple; bh=pO/fOe12LlMwxLdSxFu7PL1Eaf6knBl1c2DD4TVRdco=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=kMyeFoCVAQ2Qu1QovoPjN9Ev/7m+p5opWfr/yxzJdPXHTIyPatyIn5TE5JmZPIUfaM4hwcz/UeYjrUi9jC5xMHAjgcveWTTG5lQmwaFgfsDfJHPaMSjtdagFmyd0CnM4y6xqHye8Q+1ZZZ/JE1JE+OcyYVkD5BO/2jUDcu2IGN8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Trto9DYG; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Trto9DYG" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 625DCC433A6; Mon, 8 Apr 2024 08:09:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712563802; bh=pO/fOe12LlMwxLdSxFu7PL1Eaf6knBl1c2DD4TVRdco=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=Trto9DYGhWXG/JLpCFHG6TGltcB581TEuGaOYPozPaJCgLyAlPkG2OLzTW2vw0Pbt oBKv13C1sb1m8iWJTZ/Blhzcm3db+4R1lbKA77mmFtAb+QkoPPbFnWFGlyTvap2gVS Z4JmGFsVLVL7Xio3G+91EvnSEzIcYQYrOh4u3Ha/Yr6gCsutRE/p0kI5NukP5zCS0s eEjRNBoqBgKCPboe2d1ckrIkIPNFqr41iJR9HORUTvd7xKlcBhF1GN2GienMgDy2tR fCcYL/3z9wfFahnQU8S/oudKn9twbtMfH7HNZPAJyIWmNDt+zMAS1/GK7P414PXIkT 94cT+xWC/Gxpg== From: Benjamin Tissoires Date: Mon, 08 Apr 2024 10:09:28 +0200 Subject: [PATCH RFC bpf-next v6 3/6] bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240408-hid-bpf-sleepable-v6-3-0499ddd91b94@kernel.org> References: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> In-Reply-To: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> To: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan Cc: Benjamin Tissoires , bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org X-Mailer: b4 0.12.4 X-Developer-Signature: v=1; a=ed25519-sha256; t=1712563787; l=9197; i=bentiss@kernel.org; s=20230215; h=from:subject:message-id; bh=pO/fOe12LlMwxLdSxFu7PL1Eaf6knBl1c2DD4TVRdco=; b=IXPAZPz6EQvhVIknWcnLDykM+Cc6wqGHJ6/L1lG2uPZeG23G0tNmwC/AGXyqTiOcFSwug7SQY vMFkj+BmaGFBh+yhGLM/VMALme0OsxII3/d7/2CYoVTKph2yP0jE22M X-Developer-Key: i=bentiss@kernel.org; a=ed25519; pk=7D1DyAVh6ajCkuUTudt/chMuXWIJHlv2qCsRkIizvFw= X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC In this patch, bpf_timer_set_sleepable_cb() is functionally equivalent to bpf_timer_set_callback(), to the exception that it enforces the timer to be started with BPF_F_TIMER_SLEEPABLE. But given that bpf_timer_set_callback() is a helper when bpf_timer_set_sleepable_cb() is a kfunc, we need to teach the verifier about its attached callback. Marking that callback as sleepable will be done in a separate patch Acked-by: Eduard Zingerman Signed-off-by: Benjamin Tissoires --- changes in v6: - adapted for flags being set during timer_init changes in v5: - enforced sleepable timers to only have BPF_F_TIMER_SLEEPABLE - use is_bpf_timer_set_sleepable_cb_impl_kfunc() instead of generic is_async_cb() changes in v4: - added a new (ignored) argument to the kfunc so that we do not need to wlak the stack new in v3 (split from v2 02/10) --- kernel/bpf/helpers.c | 43 ++++++++++++++++++++++++++++++++++-- kernel/bpf/verifier.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 99 insertions(+), 4 deletions(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index fd05d4358b31..d6528359b3f4 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1291,8 +1291,8 @@ static const struct bpf_func_proto bpf_timer_init_proto = { .arg3_type = ARG_ANYTHING, }; -BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn, - struct bpf_prog_aux *, aux) +static int __bpf_timer_set_callback(struct bpf_timer_kern *timer, void *callback_fn, + struct bpf_prog_aux *aux, bool is_sleepable) { struct bpf_prog *prev, *prog = aux->prog; struct bpf_hrtimer *t; @@ -1306,6 +1306,10 @@ BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callb ret = -EINVAL; goto out; } + if (!!(t->flags & BPF_F_TIMER_SLEEPABLE) != is_sleepable) { + ret = -EINVAL; + goto out; + } if (!atomic64_read(&t->map->usercnt)) { /* maps with timers must be either held by user space * or pinned in bpffs. Otherwise timer might still be @@ -1336,6 +1340,12 @@ BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callb return ret; } +BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn, + struct bpf_prog_aux *, aux) +{ + return __bpf_timer_set_callback(timer, callback_fn, aux, false); +} + static const struct bpf_func_proto bpf_timer_set_callback_proto = { .func = bpf_timer_set_callback, .gpl_only = true, @@ -2650,6 +2660,34 @@ __bpf_kfunc void bpf_throw(u64 cookie) WARN(1, "A call to BPF exception callback should never return\n"); } +/** + * bpf_timer_set_sleepable_cb_impl() - Configure the timer to call %callback_fn + * static function in a sleepable context. + * @timer: The bpf_timer that needs to be configured + * @callback_fn: a static bpf function + * + * @returns %0 on success. %-EINVAL if %timer was not initialized with + * bpf_timer_init() earlier. %-EPERM if %timer is in a map that doesn't + * have any user references. + * The user space should either hold a file descriptor to a map with timers + * or pin such map in bpffs. When map is unpinned or file descriptor is + * closed all timers in the map will be cancelled and freed. + * + * This kfunc is equivalent to %bpf_timer_set_callback except that it tells + * the verifier that the target callback is run in a sleepable context. + */ +__bpf_kfunc int bpf_timer_set_sleepable_cb_impl(struct bpf_timer_kern *timer, + int (callback_fn)(void *map, int *key, struct bpf_timer *timer), + void *aux__ign) +{ + struct bpf_prog_aux *aux = (struct bpf_prog_aux *)aux__ign; + + if (!aux) + return -EINVAL; + + return __bpf_timer_set_callback(timer, (void *)callback_fn, aux, true); +} + __bpf_kfunc_end_defs(); BTF_KFUNCS_START(generic_btf_ids) @@ -2726,6 +2764,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null) BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) BTF_ID_FLAGS(func, bpf_dynptr_size) BTF_ID_FLAGS(func, bpf_dynptr_clone) +BTF_ID_FLAGS(func, bpf_timer_set_sleepable_cb_impl) BTF_KFUNCS_END(common_btf_ids) static const struct btf_kfunc_id_set common_kfunc_set = { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index ccfe9057d8dc..00ac3a3a5f01 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -501,8 +501,12 @@ static bool is_dynptr_ref_function(enum bpf_func_id func_id) } static bool is_sync_callback_calling_kfunc(u32 btf_id); +static bool is_async_callback_calling_kfunc(u32 btf_id); +static bool is_callback_calling_kfunc(u32 btf_id); static bool is_bpf_throw_kfunc(struct bpf_insn *insn); +static bool is_bpf_timer_set_sleepable_cb_impl_kfunc(u32 btf_id); + static bool is_sync_callback_calling_function(enum bpf_func_id func_id) { return func_id == BPF_FUNC_for_each_map_elem || @@ -530,7 +534,8 @@ static bool is_sync_callback_calling_insn(struct bpf_insn *insn) static bool is_async_callback_calling_insn(struct bpf_insn *insn) { - return bpf_helper_call(insn) && is_async_callback_calling_function(insn->imm); + return (bpf_helper_call(insn) && is_async_callback_calling_function(insn->imm)) || + (bpf_pseudo_kfunc_call(insn) && is_async_callback_calling_kfunc(insn->imm)); } static bool is_may_goto_insn(struct bpf_insn *insn) @@ -9475,7 +9480,7 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins */ env->subprog_info[subprog].is_cb = true; if (bpf_pseudo_kfunc_call(insn) && - !is_sync_callback_calling_kfunc(insn->imm)) { + !is_callback_calling_kfunc(insn->imm)) { verbose(env, "verifier bug: kfunc %s#%d not marked as callback-calling\n", func_id_name(insn->imm), insn->imm); return -EFAULT; @@ -10984,6 +10989,7 @@ enum special_kfunc_type { KF_bpf_percpu_obj_drop_impl, KF_bpf_throw, KF_bpf_iter_css_task_new, + KF_bpf_timer_set_sleepable_cb_impl, }; BTF_SET_START(special_kfunc_set) @@ -11010,6 +11016,7 @@ BTF_ID(func, bpf_throw) #ifdef CONFIG_CGROUPS BTF_ID(func, bpf_iter_css_task_new) #endif +BTF_ID(func, bpf_timer_set_sleepable_cb_impl) BTF_SET_END(special_kfunc_set) BTF_ID_LIST(special_kfunc_list) @@ -11040,6 +11047,7 @@ BTF_ID(func, bpf_iter_css_task_new) #else BTF_ID_UNUSED #endif +BTF_ID(func, bpf_timer_set_sleepable_cb_impl) static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta) { @@ -11368,12 +11376,28 @@ static bool is_sync_callback_calling_kfunc(u32 btf_id) return btf_id == special_kfunc_list[KF_bpf_rbtree_add_impl]; } +static bool is_async_callback_calling_kfunc(u32 btf_id) +{ + return btf_id == special_kfunc_list[KF_bpf_timer_set_sleepable_cb_impl]; +} + static bool is_bpf_throw_kfunc(struct bpf_insn *insn) { return bpf_pseudo_kfunc_call(insn) && insn->off == 0 && insn->imm == special_kfunc_list[KF_bpf_throw]; } +static bool is_bpf_timer_set_sleepable_cb_impl_kfunc(u32 btf_id) +{ + return btf_id == special_kfunc_list[KF_bpf_timer_set_sleepable_cb_impl]; +} + +static bool is_callback_calling_kfunc(u32 btf_id) +{ + return is_sync_callback_calling_kfunc(btf_id) || + is_async_callback_calling_kfunc(btf_id); +} + static bool is_rbtree_lock_required_kfunc(u32 btf_id) { return is_bpf_rbtree_api_kfunc(btf_id); @@ -12157,6 +12181,16 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, } } + if (is_bpf_timer_set_sleepable_cb_impl_kfunc(meta.func_id)) { + err = push_callback_call(env, insn, insn_idx, meta.subprogno, + set_timer_callback_state); + if (err) { + verbose(env, "kfunc %s#%d failed callback verification\n", + func_name, meta.func_id); + return err; + } + } + rcu_lock = is_kfunc_bpf_rcu_read_lock(&meta); rcu_unlock = is_kfunc_bpf_rcu_read_unlock(&meta); @@ -19559,6 +19593,28 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast]) { insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_1); *cnt = 1; + } else if (is_bpf_timer_set_sleepable_cb_impl_kfunc(desc->func_id)) { + /* The verifier will process callback_fn as many times as necessary + * with different maps and the register states prepared by + * set_timer_callback_state will be accurate. + * + * The following use case is valid: + * map1 is shared by prog1, prog2, prog3. + * prog1 calls bpf_timer_init for some map1 elements + * prog2 calls bpf_timer_set_callback for some map1 elements. + * Those that were not bpf_timer_init-ed will return -EINVAL. + * prog3 calls bpf_timer_start for some map1 elements. + * Those that were not both bpf_timer_init-ed and + * bpf_timer_set_callback-ed will return -EINVAL. + */ + struct bpf_insn ld_addrs[2] = { + BPF_LD_IMM64(BPF_REG_3, (long)env->prog->aux), + }; + + insn_buf[0] = ld_addrs[0]; + insn_buf[1] = ld_addrs[1]; + insn_buf[2] = *insn; + *cnt = 3; } return 0; } From patchwork Mon Apr 8 08:09:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Tissoires X-Patchwork-Id: 13620723 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B779644C71; Mon, 8 Apr 2024 08:10:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563806; cv=none; b=j/bn6K2b1hpOrnifKkysof6lV3sep8ASb1TGs+aKFdDHcomPV/kFr9jHjJwQJVnjBj+jEW6AI2dFOkbHCRIdOkf4cExwKI5T5qLwr6JDKZOGaJrPHgVKyIdNNU7uG7pb5MVyDigSM1HGxc4T9Umr2/eHMEepnwY/q6UlqNPJ+0k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563806; c=relaxed/simple; bh=F6xXyYbfZDX1T1rGuNQBL+pHxwN2Kf/niVJyhWfdoQU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=m6I7j8P5Jp1rcWqu9F0X5w5qOgHA5z53hEIhuFGcK/YRnruYQerqpixDOS2VW5ydncRgmYssR7JX3IPD/RpZZAAohUS1sENfmPFkbIB9/5VSuSlQo5T7i37yViakpkeImFZmZ4GmdRPN2Hfz2PDh3wROfiG7Z6Im4IDHHXxcnHc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=aCjsAwN5; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="aCjsAwN5" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 364F8C433C7; Mon, 8 Apr 2024 08:10:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712563806; bh=F6xXyYbfZDX1T1rGuNQBL+pHxwN2Kf/niVJyhWfdoQU=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=aCjsAwN5fiGJQ66awNc9vEq9aXNcRHSp8sO5tM2TRsvsyLVwhsxAT9N82VNcy02Ja VgY17UNED+3hgLMj7M8i1qCK96KuA6dW3ryz9ssOi6JqKTIogKtmTAMtn6B/KxZA7e ucE2eVewBtuYhigcL949wiHKgPYcCNISbOeYklyvOUWYUpVptj3BGpmzmG64w4SYaU WbAFO4DmMWEuDi5NVFNz8qrQq9sSrLifLJKFkfRXoZlrwo9KuVN5D1BGeM9XGsqzM8 K2J+70xJ8NdEPvw4wO1oCp99glFytyDsFTIyywmqP2cl3qr1w3CohaELUPEgSmfEtc bOeLFvepiJ6GA== From: Benjamin Tissoires Date: Mon, 08 Apr 2024 10:09:29 +0200 Subject: [PATCH RFC bpf-next v6 4/6] bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240408-hid-bpf-sleepable-v6-4-0499ddd91b94@kernel.org> References: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> In-Reply-To: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> To: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan Cc: Benjamin Tissoires , bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org X-Mailer: b4 0.12.4 X-Developer-Signature: v=1; a=ed25519-sha256; t=1712563787; l=3425; i=bentiss@kernel.org; s=20230215; h=from:subject:message-id; bh=F6xXyYbfZDX1T1rGuNQBL+pHxwN2Kf/niVJyhWfdoQU=; b=r5nlEnLIcuh7ege37WQb6PvbynSZjHTTOu1bMBt4RMC5kwYK0dCnCNKZF/rGk7SLnVvuz4gTH FlWCkTvU3IXA/m+YrkTpbYYyIJVmVn3Dw/np1qSvmAHxM7Hjdx/v1re X-Developer-Key: i=bentiss@kernel.org; a=ed25519; pk=7D1DyAVh6ajCkuUTudt/chMuXWIJHlv2qCsRkIizvFw= X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC Now that we have bpf_timer_set_sleepable_cb() available and working, we can tag the attached callback as sleepable, and let the verifier check in the correct context the calls and kfuncs. Signed-off-by: Benjamin Tissoires Acked-by: Eduard Zingerman --- no changes in v6 no changes in v5 changes in v4: - use a function parameter to forward the sleepable information new in v3 (split from v2 02/10) --- include/linux/bpf_verifier.h | 1 + kernel/bpf/verifier.c | 13 ++++++++++--- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 7cb1b75eee38..14e4ee67b694 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -426,6 +426,7 @@ struct bpf_verifier_state { * while they are still in use. */ bool used_as_loop_entry; + bool in_sleepable; /* first and last insn idx of this verifier state */ u32 first_insn_idx; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 00ac3a3a5f01..b75a8aca6ec9 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1434,6 +1434,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state, } dst_state->speculative = src->speculative; dst_state->active_rcu_lock = src->active_rcu_lock; + dst_state->in_sleepable = src->in_sleepable; dst_state->curframe = src->curframe; dst_state->active_lock.ptr = src->active_lock.ptr; dst_state->active_lock.id = src->active_lock.id; @@ -2407,7 +2408,7 @@ static void init_func_state(struct bpf_verifier_env *env, /* Similar to push_stack(), but for async callbacks */ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env, int insn_idx, int prev_insn_idx, - int subprog) + int subprog, bool is_sleepable) { struct bpf_verifier_stack_elem *elem; struct bpf_func_state *frame; @@ -2434,6 +2435,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env, * Initialize it similar to do_check_common(). */ elem->st.branches = 1; + elem->st.in_sleepable = is_sleepable; frame = kzalloc(sizeof(*frame), GFP_KERNEL); if (!frame) goto err; @@ -5279,7 +5281,8 @@ static int map_kptr_match_type(struct bpf_verifier_env *env, static bool in_sleepable(struct bpf_verifier_env *env) { - return env->prog->sleepable; + return env->prog->sleepable || + (env->cur_state && env->cur_state->in_sleepable); } /* The non-sleepable programs and sleepable programs with explicit bpf_rcu_read_lock() @@ -9497,7 +9500,8 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins /* there is no real recursion here. timer callbacks are async */ env->subprog_info[subprog].is_async_cb = true; async_cb = push_async_cb(env, env->subprog_info[subprog].start, - insn_idx, subprog); + insn_idx, subprog, + is_bpf_timer_set_sleepable_cb_impl_kfunc(insn->imm)); if (!async_cb) return -EFAULT; callee = async_cb->frame[0]; @@ -16947,6 +16951,9 @@ static bool states_equal(struct bpf_verifier_env *env, if (old->active_rcu_lock != cur->active_rcu_lock) return false; + if (old->in_sleepable != cur->in_sleepable) + return false; + /* for states to be equal callsites have to be the same * and all frame states need to be equivalent */ From patchwork Mon Apr 8 08:09:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Tissoires X-Patchwork-Id: 13620724 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ACA524AEC4; Mon, 8 Apr 2024 08:10:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563810; cv=none; b=D2GUs9qhYj+ZOpE0j6CAfD6XbnYSysb4QlRf1g0m9+1hWiBnN3C2gGux9zkFrTm1cC7rv38IHHpB13OTxbchnAMbJZJs9ZCSCsO1HGfm8W+IW7h6xG9PhwNi1WZJ1RsTgue1ILJqJunt4ALKRpvLEJuwJGL6OEFpVFnfSrMTNaU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563810; c=relaxed/simple; bh=635XJV7/rS2AZSR2TOFYFdxkBgPz6B2TWBzQ/ySmu9M=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=KMTaaaou7wTuv84Gt4ICP75aDRXPCyWyN6lbn1FocfcX+tOPwgUEohOc5zCdbrDdnx6kd4lxW+93E0uGbJzn8HiiWuyu9JUeeI6Rkw1TVPDsNYeiZTmu28rD2f6bmuztmgtCcoqFdbBWo0kbOHFywJzVh78RHtSbv/t4XsdM0jQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=rcmzYzSC; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="rcmzYzSC" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 080A2C43399; Mon, 8 Apr 2024 08:10:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712563810; bh=635XJV7/rS2AZSR2TOFYFdxkBgPz6B2TWBzQ/ySmu9M=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=rcmzYzSCS2UQ2yznV+tN4pM6lfJYvCiY03U13zRxlWbiIotEWFvWdSs/qgVhMI1yd rNTsHKNniThyXgmiB+e4VlzhPaZ7b8T4KfJfINbTfnBj8JVTMhcMURbifUt4tzQZNL mzbt8W+LRY18yOXl580XOf6PByqx8Pz4uTY/dcdHFX3iXBA5xmKEkvzKedKX3Firl1 mLH5KPeB3HwIsqCZwoRdO1K/cvKCHvV5KjgLnElda611Oh9ONye5pDAwWV4VaLMJZN JTjEicFtMXx/Bt3JdGpYzyPfRiuEWWuqKVFwkRvVt3z4OjSoFPZoBq9RLkiBzjP4tF 0PbWKFnIQcuwA== From: Benjamin Tissoires Date: Mon, 08 Apr 2024 10:09:30 +0200 Subject: [PATCH RFC bpf-next v6 5/6] tools: sync include/uapi/linux/bpf.h Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240408-hid-bpf-sleepable-v6-5-0499ddd91b94@kernel.org> References: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> In-Reply-To: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> To: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan Cc: Benjamin Tissoires , bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org X-Mailer: b4 0.12.4 X-Developer-Signature: v=1; a=ed25519-sha256; t=1712563787; l=1530; i=bentiss@kernel.org; s=20230215; h=from:subject:message-id; bh=635XJV7/rS2AZSR2TOFYFdxkBgPz6B2TWBzQ/ySmu9M=; b=3a7SZ7xD0328vUxDQYvfQWMfPdppH06CJBZk9RnsI5LibRtnAfsK7nV4ruTCSR80t86Vx78ab Yp4jph6hShYCpu03ot6opzyxC/9gplP3l9T/T0NKywzNn+3k3Ew5yFH X-Developer-Key: i=bentiss@kernel.org; a=ed25519; pk=7D1DyAVh6ajCkuUTudt/chMuXWIJHlv2qCsRkIizvFw= X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC cp include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h Signed-off-by: Benjamin Tissoires --- updated in v6 no changes in v5 new in v4 --- tools/include/uapi/linux/bpf.h | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index bf80b614c4db..f1890eed213a 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1662,9 +1662,10 @@ union bpf_attr { } query; struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */ - __u64 name; - __u32 prog_fd; - __aligned_u64 cookie; + __u64 name; + __u32 prog_fd; + __u32 :32; + __aligned_u64 cookie; } raw_tracepoint; struct { /* anonymous struct for BPF_BTF_LOAD */ @@ -7468,6 +7469,19 @@ enum { BPF_F_TIMER_CPU_PIN = (1ULL << 1), }; +/* Extra flags to control bpf_timer_init() behaviour, in addition to CLOCK_*. + * - BPF_F_TIMER_SLEEPABLE: Timer will run in a workqueue in a sleepable + * context, with no guarantees of ordering nor timing (consider this as + * being just offloaded immediately). + */ +enum { + /* CLOCK_* are using bits 0 to 3 */ + BPF_F_TIMER_SLEEPABLE = (1ULL << 4), + __MAX_BPF_F_TIMER_INIT, +}; + +#define MAX_BPF_F_TIMER_INIT __MAX_BPF_F_TIMER_INIT + /* BPF numbers iterator state */ struct bpf_iter_num { /* opaque iterator state; having __u64 here allows to preserve correct From patchwork Mon Apr 8 08:09:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Tissoires X-Patchwork-Id: 13620725 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8727550261; Mon, 8 Apr 2024 08:10:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563814; cv=none; b=BkH3oZyv5kE793ikygLvdYCBB39oQ/uPi245+Hxj9S6Bh9oVMRhzxgDF3vfqafo61/m2iagPeDtWG16fO1L1WXC6qgBAdl5B+P0F17dZOIW3MRi5W6Kfyofh75nSTO10mF7/EIrLonvDf3LiJE4gLTTzDrz+sPrI8sfSpRv+txk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712563814; c=relaxed/simple; bh=VK5TMS1RBt4n3wTN5emOMrl/EafXnUKW81ujAeZ/zaQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=tKUNF/gM7iua3JR3zZadGJ2E0Iqs15tKvComVVnQYiMzECnrhG0Z21jWCd8GasKndFk7YGF80q/3S/OGFgQYaJhHwzoMmsy0dZ9gUgJGbQZx4AQcTjaJFqeGDRwf+oFwh7GR0B2IMLPTLsJZS9NrjFWWRNMy+TZOyrhiCj5C6cc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=aly/g6MA; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="aly/g6MA" Received: by smtp.kernel.org (Postfix) with ESMTPSA id CD2F1C433C7; Mon, 8 Apr 2024 08:10:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712563814; bh=VK5TMS1RBt4n3wTN5emOMrl/EafXnUKW81ujAeZ/zaQ=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=aly/g6MAD0Joz022TK5sZ9ySr8NS3Gg2ZjmhgV9eUlJJq2Th/WS1AvTi0gHlnqWAD M7z1TjSgkjqnnf8o80InKUs1yDnwvKB0ZtgMxDdSbL59po7T4zGGw7xsKd5zErHN+M AwzoDToAUCDwxJwkxwhS/Tz2lOEc2Dz/IDfN3FQoyuXHTeXM8DxFZzyORFxcDsU+JZ +PMKfmVf9TPJJP7THWRxp2Adt4qHJBJQzriHT+Tf9mbsgTJ+GmRw5dDP1s35yLYAJg zzXpQmwlSSrvMEDoF8rpyBRt0MVRNqXmbrDybjuNAKafjNFinJZj8OZzpYhkefd4ls lmuNsrdOsIrEQ== From: Benjamin Tissoires Date: Mon, 08 Apr 2024 10:09:31 +0200 Subject: [PATCH RFC bpf-next v6 6/6] selftests/bpf: add sleepable timer tests Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240408-hid-bpf-sleepable-v6-6-0499ddd91b94@kernel.org> References: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> In-Reply-To: <20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org> To: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan Cc: Benjamin Tissoires , bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org X-Mailer: b4 0.12.4 X-Developer-Signature: v=1; a=ed25519-sha256; t=1712563787; l=10863; i=bentiss@kernel.org; s=20230215; h=from:subject:message-id; bh=VK5TMS1RBt4n3wTN5emOMrl/EafXnUKW81ujAeZ/zaQ=; b=DkDztKSSM7lLUnbAHdGqP2rfKfHFrniK23AmOurdhWu2n1ngDjhSbKcRfGY08H+huRvSX8YNF DafjdkIw+xHDY0WMCUKiwZoi0dO0lrq27KPkm5BiTlAhm39gffESED5 X-Developer-Key: i=bentiss@kernel.org; a=ed25519; pk=7D1DyAVh6ajCkuUTudt/chMuXWIJHlv2qCsRkIizvFw= X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC bpf_experimental.h and ../bpf_testmod/bpf_testmod_kfunc.h are both including vmlinux.h, which is not compatible with including time.h or bpf_tcp_helpers.h. So keep sleepable tests in a separate bpf source file. The first correct test is run twice for convenience: - first through RUN_TESTS - then we ensure that the timer was actually executed, in a manual load/attach/run Signed-off-by: Benjamin Tissoires Acked-by: Eduard Zingerman --- changes in v6: - use of new _init API with BPF_F_TIMER_SLEEPABLE changes in v5b: - do not use SEC(syscall) as they are run in a sleepable context already - duplicate test_call_sleepable() so we can test if the callback gets called changes in v5: - keep sleepable test separate new in v4 --- tools/testing/selftests/bpf/bpf_experimental.h | 5 + .../selftests/bpf/bpf_testmod/bpf_testmod.c | 5 + .../selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h | 1 + tools/testing/selftests/bpf/prog_tests/timer.c | 34 ++++ .../testing/selftests/bpf/progs/timer_sleepable.c | 213 +++++++++++++++++++++ 5 files changed, 258 insertions(+) diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h index a5b9df38c162..06ed98b3bc51 100644 --- a/tools/testing/selftests/bpf/bpf_experimental.h +++ b/tools/testing/selftests/bpf/bpf_experimental.h @@ -459,4 +459,9 @@ extern int bpf_iter_css_new(struct bpf_iter_css *it, extern struct cgroup_subsys_state *bpf_iter_css_next(struct bpf_iter_css *it) __weak __ksym; extern void bpf_iter_css_destroy(struct bpf_iter_css *it) __weak __ksym; +extern int bpf_timer_set_sleepable_cb_impl(struct bpf_timer *timer, + int (callback_fn)(void *map, int *key, struct bpf_timer *timer), + void *aux__ign) __ksym; +#define bpf_timer_set_sleepable_cb(timer, cb) \ + bpf_timer_set_sleepable_cb_impl(timer, cb, NULL) #endif diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c index 39ad96a18123..eb2b78552ca2 100644 --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c @@ -494,6 +494,10 @@ __bpf_kfunc static u32 bpf_kfunc_call_test_static_unused_arg(u32 arg, u32 unused return arg; } +__bpf_kfunc void bpf_kfunc_call_test_sleepable(void) +{ +} + BTF_KFUNCS_START(bpf_testmod_check_kfunc_ids) BTF_ID_FLAGS(func, bpf_testmod_test_mod_kfunc) BTF_ID_FLAGS(func, bpf_kfunc_call_test1) @@ -520,6 +524,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_test_ref, KF_TRUSTED_ARGS | KF_RCU) BTF_ID_FLAGS(func, bpf_kfunc_call_test_destructive, KF_DESTRUCTIVE) BTF_ID_FLAGS(func, bpf_kfunc_call_test_static_unused_arg) BTF_ID_FLAGS(func, bpf_kfunc_call_test_offset) +BTF_ID_FLAGS(func, bpf_kfunc_call_test_sleepable, KF_SLEEPABLE) BTF_KFUNCS_END(bpf_testmod_check_kfunc_ids) static int bpf_testmod_ops_init(struct btf *btf) diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h index 7c664dd61059..ce5cd763561c 100644 --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h @@ -96,6 +96,7 @@ void bpf_kfunc_call_test_pass2(struct prog_test_pass2 *p) __ksym; void bpf_kfunc_call_test_mem_len_fail2(__u64 *mem, int len) __ksym; void bpf_kfunc_call_test_destructive(void) __ksym; +void bpf_kfunc_call_test_sleepable(void) __ksym; void bpf_kfunc_call_test_offset(struct prog_test_ref_kfunc *p); struct prog_test_member *bpf_kfunc_call_memb_acquire(void); diff --git a/tools/testing/selftests/bpf/prog_tests/timer.c b/tools/testing/selftests/bpf/prog_tests/timer.c index d66687f1ee6a..c6c7c623b31c 100644 --- a/tools/testing/selftests/bpf/prog_tests/timer.c +++ b/tools/testing/selftests/bpf/prog_tests/timer.c @@ -3,6 +3,7 @@ #include #include "timer.skel.h" #include "timer_failure.skel.h" +#include "timer_sleepable.skel.h" #define NUM_THR 8 @@ -95,3 +96,36 @@ void serial_test_timer(void) RUN_TESTS(timer_failure); } + +/* isolate sleepable tests from main timer tests + * because if test_timer fails, it spews the console + * with 10000 * 5 "spin_lock_thread:PASS:test_run_opts retval 0 nsec" + */ +void serial_test_sleepable_timer(void) +{ + struct timer_sleepable *timer_sleepable_skel = NULL; + int err, prog_fd; + + LIBBPF_OPTS(bpf_test_run_opts, topts); + + RUN_TESTS(timer_sleepable); + + /* re-run the success test to check if the timer was actually executed */ + + timer_sleepable_skel = timer_sleepable__open_and_load(); + if (!ASSERT_OK_PTR(timer_sleepable_skel, "timer_sleepable_skel_load")) + return; + + err = timer_sleepable__attach(timer_sleepable_skel); + if (!ASSERT_OK(err, "timer_sleepable_attach")) + return; + + prog_fd = bpf_program__fd(timer_sleepable_skel->progs.test_syscall_sleepable); + err = bpf_prog_test_run_opts(prog_fd, &topts); + ASSERT_OK(err, "test_run"); + ASSERT_EQ(topts.retval, 0, "test_run"); + + usleep(50); /* 10 usecs should be enough, but give it extra */ + + ASSERT_EQ(timer_sleepable_skel->bss->ok_sleepable, 1, "ok_sleepable"); +} diff --git a/tools/testing/selftests/bpf/progs/timer_sleepable.c b/tools/testing/selftests/bpf/progs/timer_sleepable.c new file mode 100644 index 000000000000..fc7829d8b6c4 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/timer_sleepable.c @@ -0,0 +1,213 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook + * Copyright (c) 2024 Benjamin Tissoires + */ + +#include "bpf_experimental.h" +#include +#include "bpf_misc.h" +#include "../bpf_testmod/bpf_testmod_kfunc.h" + +char _license[] SEC("license") = "GPL"; + +#define CLOCK_MONOTONIC 1 + +struct elem { + struct bpf_timer t; +}; + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 3); + __type(key, int); + __type(value, struct elem); +} timer_map SEC(".maps"); + +__u32 ok_sleepable; + +/* callback for sleepable timer */ +static int timer_cb_sleepable(void *map, int *key, struct bpf_timer *timer) +{ + bpf_kfunc_call_test_sleepable(); + ok_sleepable |= (1 << *key); + return 0; +} + +SEC("tc") +/* check that calling bpf_timer_start() with BPF_F_TIMER_SLEEPABLE on a sleepable + * callback works + */ +__retval(0) +long test_call_sleepable(void *ctx) +{ + int key = 0; + struct bpf_timer *timer; + + if (ok_sleepable & 1) + return -1; + + timer = bpf_map_lookup_elem(&timer_map, &key); + if (timer) { + if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE) != 0) + return -2; + bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable); + if (bpf_timer_start(timer, 0, 0)) + return -3; + } else { + return -4; + } + + return 0; +} + +SEC("syscall") +/* check that calling bpf_timer_start() with BPF_F_TIMER_SLEEPABLE on a sleepable + * callback works. + */ +__retval(0) +long test_syscall_sleepable(void *ctx) +{ + int key = 0; + struct bpf_timer *timer; + + if (ok_sleepable & 1) + return -1; + + timer = bpf_map_lookup_elem(&timer_map, &key); + if (timer) { + if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE) != 0) + return -2; + bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable); + if (bpf_timer_start(timer, 0, 0)) + return -3; + } else { + return -4; + } + + return 0; +} + +SEC("?tc") +__log_level(2) +__failure +/* check that bpf_timer_set_callback() can not be called with a + * sleepable callback + */ +__msg("mark_precise: frame0: regs=r1 stack= before") +__msg(": (85) call bpf_kfunc_call_test_sleepable#") /* anchor message */ +__msg("program must be sleepable to call sleepable kfunc bpf_kfunc_call_test_sleepable") +long test_non_sleepable_sleepable_callback(void *ctx) +{ + int key = 0; + struct bpf_timer *timer; + + timer = bpf_map_lookup_elem(&timer_map, &key); + if (timer) { + bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE); + bpf_timer_set_callback(timer, timer_cb_sleepable); + bpf_timer_start(timer, 0, 0); + } + + return 0; +} + +SEC("tc") +/* check that calling bpf_timer_set_sleepable_cb() without a sleepable timer + * initialized with BPF_F_TIMER_SLEEPABLE on a sleepable + * callback is returning -EINVAL + */ +__retval(3) +long test_call_sleepable_missing_flag(void *ctx) +{ + int key = 1; + struct bpf_timer *timer; + + timer = bpf_map_lookup_elem(&timer_map, &key); + if (!timer) + return 1; + + if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC)) + return 2; + + if (bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable)) + return 3; + + return bpf_timer_start(timer, 0, 0); +} + +SEC("tc") +/* check that calling bpf_timer_start() with a delay on a sleepable + * callback is returning -EINVAL + */ +__retval(-22) +long test_call_sleepable_delay(void *ctx) +{ + int key = 2; + struct bpf_timer *timer; + + timer = bpf_map_lookup_elem(&timer_map, &key); + if (!timer) + return 1; + + if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE)) + return 2; + + if (bpf_timer_set_sleepable_cb(timer, timer_cb_sleepable)) + return 3; + + return bpf_timer_start(timer, 1, 0); +} + +SEC("?tc") +__log_level(2) +__failure +/* check that the first argument of bpf_timer_set_callback() + * is a correct bpf_timer pointer. + */ +__msg("mark_precise: frame0: regs=r1 stack= before") +__msg(": (85) call bpf_timer_set_sleepable_cb_impl#") /* anchor message */ +__msg("arg#0 doesn't point to a map value") +long test_wrong_pointer(void *ctx) +{ + int key = 0; + struct bpf_timer *timer; + + timer = bpf_map_lookup_elem(&timer_map, &key); + if (!timer) + return 1; + + if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE)) + return 2; + + if (bpf_timer_set_sleepable_cb((void *)&timer, timer_cb_sleepable)) + return 3; + + return -22; +} + +SEC("?tc") +__log_level(2) +__failure +/* check that the first argument of bpf_timer_set_callback() + * is a correct bpf_timer pointer. + */ +__msg("mark_precise: frame0: regs=r1 stack= before") +__msg(": (85) call bpf_timer_set_sleepable_cb_impl#") /* anchor message */ +__msg("off 1 doesn't point to 'struct bpf_timer' that is at 0") +long test_wrong_pointer_offset(void *ctx) +{ + int key = 0; + struct bpf_timer *timer; + + timer = bpf_map_lookup_elem(&timer_map, &key); + if (!timer) + return 1; + + if (bpf_timer_init(timer, &timer_map, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE)) + return 2; + + if (bpf_timer_set_sleepable_cb((void *)timer + 1, timer_cb_sleepable)) + return 3; + + return -22; +}